| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Moyster <oysterized@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(cherry picked from commit b277da0a8a594308e17881f4926879bd5fca2a2d)
Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set
QUEUE_FLAG_NONROT.
Historically, all block devices have automatically made entropy
contributions. But as previously stated in commit e2e1a148 ("block: add
sysfs knob for turning off disk entropy contributions"):
- On SSD disks, the completion times aren't as random as they
are for rotational drives. So it's questionable whether they
should contribute to the random pool in the first place.
- Calling add_disk_randomness() has a lot of overhead.
There are more reliable sources for randomness than non-rotational block
devices. From a security perspective it is better to err on the side of
caution than to allow entropy contributions from unreliable "random"
sources.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Mister Oyster <oysterized@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
response
commit fdb7cee3b9e3c561502e58137a837341f10cbf8b upstream.
At the default trace level, we only trace unsuccessful events including
FSF responses.
zfcp_dbf_hba_fsf_response() only used protocol status and FSF status to
decide on an unsuccessful response. However, this is only one of multiple
possible sources determining a failed struct zfcp_fsf_req.
An FSF request can also "fail" if its response runs into an ERP timeout
or if it gets dismissed because a higher level recovery was triggered
[trace tags "erscf_1" or "erscf_2" in zfcp_erp_strategy_check_fsfreq()].
FSF requests with ERP timeout are:
FSF_QTCB_EXCHANGE_CONFIG_DATA, FSF_QTCB_EXCHANGE_PORT_DATA,
FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT or
FSF_QTCB_CLOSE_PHYSICAL_PORT for target ports,
FSF_QTCB_OPEN_LUN, FSF_QTCB_CLOSE_LUN.
One example is slow queue processing which can cause follow-on errors,
e.g. FSF_PORT_ALREADY_OPEN after FSF_QTCB_OPEN_PORT_WITH_DID timed out.
In order to see the root cause, we need to see late responses even if the
channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD.
Example trace records formatted with zfcpdbf from the s390-tools package:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : ...
Record ID : 1
Tag : fcegpf1
LUN : 0xffffffffffffffff
WWPN : 0x<WWPN>
D_ID : 0x00<D_ID>
Adapter status : 0x5400050b
Port status : 0x41200000
LUN status : 0x00000000
Ready count : 0x00000001
Running count : 0x...
ERP want : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT
ERP need : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT
|
Timestamp : ... 30 seconds later
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : ...
Record ID : 2
Tag : erscf_2
LUN : 0xffffffffffffffff
WWPN : 0x<WWPN>
D_ID : 0x00<D_ID>
Adapter status : 0x5400050b
Port status : 0x41200000
LUN status : 0x00000000
Request ID : 0x<request_ID>
ERP status : 0x10000000 ZFCP_STATUS_ERP_TIMEDOUT
ERP step : 0x0800 ZFCP_ERP_STEP_PORT_OPENING
ERP action : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT
ERP count : 0x00
|
Timestamp : ... later than previous record
Area : HBA
Subarea : 00
Level : 5 > default level => 3 <= default level
Exception : -
CPU ID : 00
Caller : ...
Record ID : 1
Tag : fs_qtcb => fs_rerr
Request ID : 0x<request_ID>
Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED
| ZFCP_STATUS_FSFREQ_CLEANUP
FSF cmnd : 0x00000005
FSF sequence no: 0x...
FSF issued : ... > 30 seconds ago
FSF stat : 0x00000000 FSF_GOOD
FSF stat qual : 00000000 00000000 00000000 00000000
Prot stat : 0x00000001 FSF_PROT_GOOD
Prot stat qual : 00000000 00000000 00000000 00000000
Port handle : 0x...
LUN handle : 0x00000000
QTCB log length: ...
QTCB log info : ...
In case of problems detecting that new responses are waiting on the input
queue, we sooner or later trigger adapter recovery due to an FSF request
timeout (trace tag "fsrth_1").
FSF requests with FSF request timeout are:
typically FSF_QTCB_ABORT_FCP_CMND; but theoretically also
FSF_QTCB_EXCHANGE_CONFIG_DATA or FSF_QTCB_EXCHANGE_PORT_DATA via sysfs,
FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT for WKA ports,
FSF_QTCB_FCP_CMND for task management function (LUN / target reset).
One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD
because the channel filled in the response via DMA into the request's QTCB.
In a theroretical case, inject code can create an erroneous FSF request
on purpose. If data router is enabled, it uses deferred error reporting.
A READ SCSI command can succeed with FSF_PROT_GOOD, FSF_GOOD, and
SAM_STAT_GOOD. But on writing the read data to host memory via DMA,
it can still fail, e.g. if an intentionally wrong scatter list does not
provide enough space. Rather than getting an unsuccessful response,
we get a QDIO activate check which in turn triggers adapter recovery.
One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD
because the channel filled in the response via DMA into the request's QTCB.
Example trace records formatted with zfcpdbf from the s390-tools package:
Timestamp : ...
Area : HBA
Subarea : 00
Level : 6 > default level => 3 <= default level
Exception : -
CPU ID : ..
Caller : ...
Record ID : 1
Tag : fs_norm => fs_rerr
Request ID : 0x<request_ID2>
Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED
| ZFCP_STATUS_FSFREQ_CLEANUP
FSF cmnd : 0x00000001
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000000 FSF_GOOD
FSF stat qual : 00000000 00000000 00000000 00000000
Prot stat : 0x00000001 FSF_PROT_GOOD
Prot stat qual : ........ ........ 00000000 00000000
Port handle : 0x...
LUN handle : 0x...
|
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 3
Exception : -
CPU ID : ..
Caller : ...
Record ID : 1
Tag : rsl_err
Request ID : 0x<request_ID2>
SCSI ID : 0x...
SCSI LUN : 0x...
SCSI result : 0x000e0000 DID_TRANSPORT_DISRUPTED
SCSI retries : 0x00
SCSI allowed : 0x05
SCSI scribble : 0x<request_ID2>
SCSI opcode : 28... Read(10)
FCP rsp inf cod: 0x00
FCP rsp IU : 00000000 00000000 00000000 00000000
^^ SAM_STAT_GOOD
00000000 00000000
Only with luck in both above cases, we could see a follow-on trace record
of an unsuccesful event following a successful but late FSF response with
FSF_PROT_GOOD and FSF_GOOD. Typically this was the case for I/O requests
resulting in a SCSI trace record "rsl_err" with DID_TRANSPORT_DISRUPTED
[On ZFCP_STATUS_FSFREQ_DISMISSED, zfcp_fsf_protstatus_eval() sets
ZFCP_STATUS_FSFREQ_ERROR seen by the request handler functions as failure].
However, the reason for this follow-on trace was invisible because the
corresponding HBA trace record was missing at the default trace level
(by default hidden records with tags "fs_norm", "fs_qtcb", or "fs_open").
On adapter recovery, after we had shut down the QDIO queues, we perform
unsuccessful pseudo completions with flag ZFCP_STATUS_FSFREQ_DISMISSED
for each pending FSF request in zfcp_fsf_req_dismiss_all().
In order to find the root cause, we need to see all pseudo responses even
if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD.
Therefore, check zfcp_fsf_req.status for ZFCP_STATUS_FSFREQ_DISMISSED
or ZFCP_STATUS_FSFREQ_ERROR and trace with a new tag "fs_rerr".
It does not matter that there are numerous places which set
ZFCP_STATUS_FSFREQ_ERROR after the location where we trace an FSF response
early. These cases are based on protocol status != FSF_PROT_GOOD or
== FSF_PROT_FSF_STATUS_PRESENTED and are thus already traced by default
as trace tag "fs_perr" or "fs_ferr" respectively.
NB: The trace record with tag "fssrh_1" for status read buffers on dismiss
all remains. zfcp_fsf_req_complete() handles this and returns early.
All other FSF request types are handled separately and as described above.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features")
Fixes: 2e261af84cdb ("[SCSI] zfcp: Only collect FSF/HBA debug data for matching trace levels")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 12c3e5754c8022a4f2fd1e9f00d19e99ee0d3cc1 upstream.
If the FCP_RSP UI has optional parts (FCP_SNS_INFO or FCP_RSP_INFO) and
thus does not fit into the fsp_rsp field built into a SCSI trace record,
trace the full FCP_RSP UI with all optional parts as payload record
instead of just FCP_SNS_INFO as payload and
a 1 byte RSP_INFO_CODE part of FCP_RSP_INFO built into the SCSI record.
That way we would also get the full FCP_SNS_INFO in case a
target would ever send more than
min(SCSI_SENSE_BUFFERSIZE==96, ZFCP_DBF_PAY_MAX_REC==256)==96.
The mandatory part of FCP_RSP IU is only 24 bytes.
PAYload costs at least one full PAY record of 256 bytes anyway.
We cap to the hardware response size which is only FSF_FCP_RSP_SIZE==128.
So we can just put the whole FCP_RSP IU with any optional parts into
PAYload similarly as we do for SAN PAY since v4.9 commit aceeffbb59bb
("zfcp: trace full payload of all SAN records (req,resp,iels)").
This does not cause any additional trace records wasting memory.
Decoded trace records were confusing because they showed a hard-coded
sense data length of 96 even if the FCP_RSP_IU field FCP_SNS_LEN showed
actually less.
Since the same commit, we set pl_len for SAN traces to the full length of a
request/response even if we cap the corresponding trace.
In contrast, here for SCSI traces we set pl_len to the pre-computed
length of FCP_RSP IU considering SNS_LEN or RSP_LEN if valid.
Nonetheless we trace a hardcoded payload of length FSF_FCP_RSP_SIZE==128
if there were optional parts.
This makes it easier for the zfcpdbf tool to format only the relevant
part of the long FCP_RSP UI buffer. And any trailing information is still
available in the payload trace record just in case.
Rename the payload record tag from "fcp_sns" to "fcp_riu" to make the new
content explicit to zfcpdbf which can then pick a suitable field name such
as "FCP rsp IU all:" instead of "Sense info :"
Also, the same zfcpdbf can still be backwards compatible with "fcp_sns".
Old example trace record before this fix, formatted with the tool zfcpdbf
from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 3
Exception : -
CPU id : ..
Caller : 0x...
Record id : 1
Tag : rsl_err
Request id : 0x<request_id>
SCSI ID : 0x...
SCSI LUN : 0x...
SCSI result : 0x00000002
SCSI retries : 0x00
SCSI allowed : 0x05
SCSI scribble : 0x<request_id>
SCSI opcode : 00000000 00000000 00000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU : 00000000 00000000 00000202 00000000
^^==FCP_SNS_LEN_VALID
00000020 00000000
^^^^^^^^==FCP_SNS_LEN==32
Sense len : 96 <==min(SCSI_SENSE_BUFFERSIZE,ZFCP_DBF_PAY_MAX_REC)
Sense info : 70000600 00000018 00000000 29000000
00000400 00000000 00000000 00000000
00000000 00000000 00000000 00000000<==superfluous
00000000 00000000 00000000 00000000<==superfluous
00000000 00000000 00000000 00000000<==superfluous
00000000 00000000 00000000 00000000<==superfluous
New example trace records with this fix:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 3
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : rsl_err
Request ID : 0x<request_id>
SCSI ID : 0x...
SCSI LUN : 0x...
SCSI result : 0x00000002
SCSI retries : 0x00
SCSI allowed : 0x03
SCSI scribble : 0x<request_id>
SCSI opcode : a30c0112 00000000 02000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU : 00000000 00000000 00000a02 00000200
00000020 00000000
FCP rsp IU len : 56
FCP rsp IU all : 00000000 00000000 00000a02 00000200
^^=FCP_RESID_UNDER|FCP_SNS_LEN_VALID
00000020 00000000 70000500 00000018
^^^^^^^^==FCP_SNS_LEN
^^^^^^^^^^^^^^^^^
00000000 240000cb 00011100 00000000
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
00000000 00000000
^^^^^^^^^^^^^^^^^==FCP_SNS_INFO
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : lr_okay
Request ID : 0x<request_id>
SCSI ID : 0x...
SCSI LUN : 0x...
SCSI result : 0x00000000
SCSI retries : 0x00
SCSI allowed : 0x05
SCSI scribble : 0x<request_id>
SCSI opcode : <CDB of unrelated SCSI command passed to eh handler>
FCP rsp inf cod: 0x00
FCP rsp IU : 00000000 00000000 00000100 00000000
00000000 00000008
FCP rsp IU len : 32
FCP rsp IU all : 00000000 00000000 00000100 00000000
^^==FCP_RSP_LEN_VALID
00000000 00000008 00000000 00000000
^^^^^^^^==FCP_RSP_LEN
^^^^^^^^^^^^^^^^^==FCP_RSP_INFO
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 1a5d999ebfc7bfe28deb48931bb57faa8e4102b6 upstream.
For problem determination we need to see that we were in scsi_eh
as well as whether and why we were successful or not.
The following commits introduced new early returns without adding
a trace record:
v2.6.35 commit a1dbfddd02d2
("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
on fc_block_scsi_eh() returning != 0 which is FAST_IO_FAIL,
v2.6.30 commit 63caf367e1c9
("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
on not having gotten an FSF request after the maximum number of retry
attempts and thus could not issue a TMF and has to return FAILED.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
Fixes: 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit a099b7b1fc1f0418ab8d79ecf98153e1e134656e upstream.
Up until now zfcp would just ignore the FCP_RESID_OVER flag in the FCP
response IU. When this flag is set, it is possible, in regards to the
FCP standard, that the storage-server processes the command normally, up
to the point where data is missing and simply ignores those.
In this case no CHECK CONDITION would be set, and because we ignored the
FCP_RESID_OVER flag we resulted in at least a data loss or even
-corruption as a follow-up error, depending on how the
applications/layers on top behave. To prevent this, we now set the
host-byte of the corresponding scsi_cmnd to DID_ERROR.
Other storage-behaviors, where the same condition results in a CHECK
CONDITION set in the answer, don't need to be changed as they are
handled in the mid-layer already.
Following is an example trace record decoded with zfcpdbf from the
s390-tools package. We forcefully injected a fc_dl which is one byte too
small:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 3
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : rsl_err
Request ID : 0x...
SCSI ID : 0x...
SCSI LUN : 0x...
SCSI result : 0x00070000
^^DID_ERROR
SCSI retries : 0x..
SCSI allowed : 0x..
SCSI scribble : 0x...
SCSI opcode : 2a000000 00000000 08000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU : 00000000 00000000 00000400 00000001
^^fr_flags==FCP_RESID_OVER
^^fr_status==SAM_STAT_GOOD
^^^^^^^^fr_resid
00000000 00000000
As of now, we don't actively handle to possibility that a response IU
has both flags - FCP_RESID_OVER and FCP_RESID_UNDER - set at once.
Reported-by: Luke M. Hopkins <lmhopkin@us.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 553448f6c483 ("[SCSI] zfcp: Message cleanup")
Fixes: ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
Cc: <stable@vger.kernel.org> #2.6.33+
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 71b8e45da51a7b64a23378221c0a5868bd79da4f upstream.
Since commit db007fc5e20c ("[SCSI] Command protection operation"),
scsi_eh_prep_cmnd() saves scmd->prot_op and temporarily resets it to
SCSI_PROT_NORMAL.
Other FCP LLDDs such as qla2xxx and lpfc shield their queuecommand()
to only access any of scsi_prot_sg...() if
(scsi_get_prot_op(cmd) != SCSI_PROT_NORMAL).
Do the same thing for zfcp, which introduced DIX support with
commit ef3eb71d8ba4 ("[SCSI] zfcp: Introduce experimental support for
DIF/DIX").
Otherwise, TUR SCSI commands as part of scsi_eh likely fail in zfcp,
because the regular SCSI command with DIX protection data, that scsi_eh
re-uses in scsi_send_eh_cmnd(), of course still has
(scsi_prot_sg_count() != 0) and so zfcp sends down bogus requests to the
FCP channel hardware.
This causes scsi_eh_test_devices() to have (finish_cmds == 0)
[not SCSI device is online or not scsi_eh_tur() failed]
so regular SCSI commands, that caused / were affected by scsi_eh,
are moved to work_q and scsi_eh_test_devices() itself returns false.
In turn, it unnecessarily escalates in our case in scsi_eh_ready_devs()
beyond host reset to finally scsi_eh_offline_sdevs()
which sets affected SCSI devices offline with the following kernel message:
"kernel: sd H:0:T:L: Device offlined - not ready after error recovery"
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: ef3eb71d8ba4 ("[SCSI] zfcp: Introduce experimental support for DIF/DIX")
Cc: <stable@vger.kernel.org> #2.6.36+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 5457e03de918f7a3e294eb9d26a608ab8a579976 upstream.
The buffer for iucv_message_receive() needs to be below 2 GB. In
__iucv_message_receive(), the buffer address is casted to an u32, which
would result in either memory corruption or an addressing exception when
using addresses >= 2 GB.
Fix this by using GFP_DMA for the buffer allocation.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 1e4a382fdc0ba8d1a85b758c0811de3a3631085e upstream.
For devices with multiple input queues, tiqdio_call_inq_handlers()
iterates over all input queues and clears the device's DSCI
during each iteration. If the DSCI is re-armed during one
of the later iterations, we therefore do not scan the previous
queues again.
The re-arming also raises a new adapter interrupt. But its
handler does not trigger a rescan for the device, as the DSCI
has already been erroneously cleared.
This can result in queue stalls on devices with multiple
input queues.
Fix it by clearing the DSCI just once, prior to scanning the queues.
As the code is moved in front of the loop, we also need to access
the DSCI directly (ie irq->dsci) instead of going via each queue's
parent pointer to the same irq. This is not a functional change,
and a follow-up patch will clean up the other users.
In practice, this bug only affects CQ-enabled HiperSockets devices,
ie. devices with sysfs-attribute "hsuid" set. Setting a hsuid is
needed for AF_IUCV socket applications that use HiperSockets
communication.
Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks")
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 2dfa6688aafdc3f74efeb1cf05fb871465d67f79 upstream.
Dan Carpenter kindly reported:
<quote>
The patch d27a7cb91960: "zfcp: trace on request for open and close of
WKA port" from Aug 10, 2016, leads to the following static checker
warning:
drivers/s390/scsi/zfcp_fsf.c:1615 zfcp_fsf_open_wka_port()
warn: 'req' was already freed.
drivers/s390/scsi/zfcp_fsf.c
1609 zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
1610 retval = zfcp_fsf_req_send(req);
1611 if (retval)
1612 zfcp_fsf_req_free(req);
^^^
Freed.
1613 out:
1614 spin_unlock_irq(&qdio->req_q_lock);
1615 if (req && !IS_ERR(req))
1616 zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id);
^^^^^^^^^^^
Use after free.
1617 return retval;
1618 }
Same thing for zfcp_fsf_close_wka_port() as well.
</quote>
Rather than relying on req being NULL (or ERR_PTR) for all cases where
we don't want to trace or should not trace,
simply check retval which is unconditionally initialized with -EIO != 0
and it can only become 0 on successful retval = zfcp_fsf_req_send(req).
With that we can also remove the then again unnecessary unconditional
initialization of req which was introduced with that earlier commit.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6f2ce1c6af37191640ee3ff6e8fc39ea10352f4c upstream.
It is unavoidable that zfcp_scsi_queuecommand() has to finish requests
with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time
window when zfcp detected an unavailable rport but
fc_remote_port_delete(), which is asynchronous via
zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.
However, for the case when the rport becomes available again, we should
prevent unblocking the rport too early. In contrast to other FCP LLDDs,
zfcp has to open each LUN with the FCP channel hardware before it can
send I/O to a LUN. So if a port already has LUNs attached and we
unblock the rport just after port recovery, recoveries of LUNs behind
this port can still be pending which in turn force
zfcp_scsi_queuecommand() to unnecessarily finish requests with
DID_IMM_RETRY.
This also opens a time window with unblocked rport (until the followup
LUN reopen recovery has finished). If a scsi_cmnd timeout occurs during
this time window fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents a clean and
timely path failover. This should not happen if the path issue can be
recovered on FC transport layer such as path issues involving RSCNs.
Fix this by only calling zfcp_scsi_schedule_rport_register(), to
asynchronously trigger fc_remote_port_add(), after all LUN recoveries as
children of the rport have finished and no new recoveries of equal or
higher order were triggered meanwhile. Finished intentionally includes
any recovery result no matter if successful or failed (still unblock
rport so other successful LUNs work). For simplicity, we check after
each finished LUN recovery if there is another LUN recovery pending on
the same port and then do nothing. We handle the special case of a
successful recovery of a port without LUN children the same way without
changing this case's semantics.
For debugging we introduce 2 new trace records written if the rport
unblock attempt was aborted due to still unfinished or freshly triggered
recovery. The records are only written above the default trace level.
Benjamin noticed the important special case of new recovery that can be
triggered between having given up the erp_lock and before calling
zfcp_erp_action_cleanup() within zfcp_erp_strategy(). We must avoid the
following sequence:
ERP thread rport_work other context
------------------------- -------------- --------------------------------
port is unblocked, rport still blocked,
due to pending/running ERP action,
so ((port->status & ...UNBLOCK) != 0)
and (port->rport == NULL)
unlock ERP
zfcp_erp_action_cleanup()
case ZFCP_ERP_ACTION_REOPEN_LUN:
zfcp_erp_try_rport_unblock()
((status & ...UNBLOCK) != 0) [OLD!]
zfcp_erp_port_reopen()
lock ERP
zfcp_erp_port_block()
port->status clear ...UNBLOCK
unlock ERP
zfcp_scsi_schedule_rport_block()
port->rport_task = RPORT_DEL
queue_work(rport_work)
zfcp_scsi_rport_work()
(port->rport_task != RPORT_ADD)
port->rport_task = RPORT_NONE
zfcp_scsi_rport_block()
if (!port->rport) return
zfcp_scsi_schedule_rport_register()
port->rport_task = RPORT_ADD
queue_work(rport_work)
zfcp_scsi_rport_work()
(port->rport_task == RPORT_ADD)
port->rport_task = RPORT_NONE
zfcp_scsi_rport_register()
(port->rport == NULL)
rport = fc_remote_port_add()
port->rport = rport;
Now the rport was erroneously unblocked while the zfcp_port is blocked.
This is another situation we want to avoid due to scsi_eh
potential. This state would at least remain until the new recovery from
the other context finished successfully, or potentially forever if it
failed. In order to close this race, we take the erp_lock inside
zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or
LUN. With that, the possible corresponding rport state sequences would
be: (unblock[ERP thread],block[other context]) if the ERP thread gets
erp_lock first and still sees ((port->status & ...UNBLOCK) != 0),
(block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock
after the other context has already cleard ...UNBLOCK from port->status.
Since checking fields of struct erp_action is unsafe because they could
have been overwritten (re-used for new recovery) meanwhile, we only
check status of zfcp_port and LUN since these are only changed under
erp_lock elsewhere. Regarding the check of the proper status flags (port
or port_forced are similar to the shown adapter recovery):
[zfcp_erp_adapter_shutdown()]
zfcp_erp_adapter_reopen()
zfcp_erp_adapter_block()
* clear UNBLOCK ---------------------------------------+
zfcp_scsi_schedule_rports_block() |
write_lock_irqsave(&adapter->erp_lock, flags);-------+ |
zfcp_erp_action_enqueue() | |
zfcp_erp_setup_act() | |
* set ERP_INUSE -----------------------------------|--|--+
write_unlock_irqrestore(&adapter->erp_lock, flags);--+ | |
.context-switch. | |
zfcp_erp_thread() | |
zfcp_erp_strategy() | |
write_lock_irqsave(&adapter->erp_lock, flags);------+ | |
... | | |
zfcp_erp_strategy_check_target() | | |
zfcp_erp_strategy_check_adapter() | | |
zfcp_erp_adapter_unblock() | | |
* set UNBLOCK -----------------------------------|--+ |
zfcp_erp_action_dequeue() | |
* clear ERP_INUSE ---------------------------------|-----+
... |
write_unlock_irqrestore(&adapter->erp_lock, flags);-+
Hence, we should check for both UNBLOCK and ERP_INUSE because they are
interleaved. Also we need to explicitly check ERP_FAILED for the link
down case which currently does not clear the UNBLOCK flag in
zfcp_fsf_link_down_info_eval().
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8830271c4819 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 56d23ed7adf3974f10e91b643bd230e9c65b5f79 upstream.
Since quite a while, Linux issues enough SCSI commands per scsi_device
which successfully return with FCP_RESID_UNDER, FSF_FCP_RSP_AVAILABLE,
and SAM_STAT_GOOD. This floods the HBA trace area and we cannot see
other and important HBA trace records long enough.
Therefore, do not trace HBA response errors for pure benign residual
under counts at the default trace level.
This excludes benign residual under count combined with other validity
bits set in FCP_RSP_IU, such as FCP_SNS_LEN_VAL. For all those other
cases, we still do want to see both the HBA record and the corresponding
SCSI record by default.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit dac37e15b7d511e026a9313c8c46794c144103cd upstream.
When SCSI EH invokes zFCP's callbacks for eh_device_reset_handler() and
eh_target_reset_handler(), it expects us to relent the ownership over
the given scsi_cmnd and all other scsi_cmnds within the same scope - LUN
or target - when returning with SUCCESS from the callback ('release'
them). SCSI EH can then reuse those commands.
We did not follow this rule to release commands upon SUCCESS; and if
later a reply arrived for one of those supposed to be released commands,
we would still make use of the scsi_cmnd in our ingress tasklet. This
will at least result in undefined behavior or a kernel panic because of
a wrong kernel pointer dereference.
To fix this, we NULLify all pointers to scsi_cmnds (struct zfcp_fsf_req
*)->data in the matching scope if a TMF was successful. This is done
under the locks (struct zfcp_adapter *)->abort_lock and (struct
zfcp_reqlist *)->lock to prevent the requests from being removed from
the request-hashtable, and the ingress tasklet from making use of the
scsi_cmnd-pointer in zfcp_fsf_fcp_cmnd_handler().
For cases where a reply arrives during SCSI EH, but before we get a
chance to NULLify the pointer - but before we return from the callback
-, we assume that the code is protected from races via the CAS operation
in blk_complete_request() that is called in scsi_done().
The following stacktrace shows an example for a crash resulting from the
previous behavior:
Unable to handle kernel pointer dereference at virtual kernel address fffffee17a672000
Oops: 0038 [#1] SMP
CPU: 2 PID: 0 Comm: swapper/2 Not tainted
task: 00000003f7ff5be0 ti: 00000003f3d38000 task.ti: 00000003f3d38000
Krnl PSW : 0404d00180000000 00000000001156b0 (smp_vcpu_scheduled+0x18/0x40)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
Krnl GPRS: 000000200000007e 0000000000000000 fffffee17a671fd8 0000000300000015
ffffffff80000000 00000000005dfde8 07000003f7f80e00 000000004fa4e800
000000036ce8d8f8 000000036ce8d9c0 00000003ece8fe00 ffffffff969c9e93
00000003fffffffd 000000036ce8da10 00000000003bf134 00000003f3b07918
Krnl Code: 00000000001156a2: a7190000 lghi %r1,0
00000000001156a6: a7380015 lhi %r3,21
#00000000001156aa: e32050000008 ag %r2,0(%r5)
>00000000001156b0: 482022b0 lh %r2,688(%r2)
00000000001156b4: ae123000 sigp %r1,%r2,0(%r3)
00000000001156b8: b2220020 ipm %r2
00000000001156bc: 8820001c srl %r2,28
00000000001156c0: c02700000001 xilf %r2,1
Call Trace:
([<0000000000000000>] 0x0)
[<000003ff807bdb8e>] zfcp_fsf_fcp_cmnd_handler+0x3de/0x490 [zfcp]
[<000003ff807be30a>] zfcp_fsf_req_complete+0x252/0x800 [zfcp]
[<000003ff807c0a48>] zfcp_fsf_reqid_check+0xe8/0x190 [zfcp]
[<000003ff807c194e>] zfcp_qdio_int_resp+0x66/0x188 [zfcp]
[<000003ff80440c64>] qdio_kick_handler+0xdc/0x310 [qdio]
[<000003ff804463d0>] __tiqdio_inbound_processing+0xf8/0xcd8 [qdio]
[<0000000000141fd4>] tasklet_action+0x9c/0x170
[<0000000000141550>] __do_softirq+0xe8/0x258
[<000000000010ce0a>] do_softirq+0xba/0xc0
[<000000000014187c>] irq_exit+0xc4/0xe8
[<000000000046b526>] do_IRQ+0x146/0x1d8
[<00000000005d6a3c>] io_return+0x0/0x8
[<00000000005d6422>] vtime_stop_cpu+0x4a/0xa0
([<0000000000000000>] 0x0)
[<0000000000103d8a>] arch_cpu_idle+0xa2/0xb0
[<0000000000197f94>] cpu_startup_entry+0x13c/0x1f8
[<0000000000114782>] smp_start_secondary+0xda/0xe8
[<00000000005d6efe>] restart_int_handler+0x56/0x6c
[<0000000000000000>] 0x0
Last Breaking-Event-Address:
[<00000000003bf12e>] arch_spin_lock_wait+0x56/0xb0
Suggested-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: ea127f9754 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit e7cb08e894a0b876443ef8fdb0706575dc00a5d2 upstream.
We accidentally overwrite the original saved value of "flags" so that we
can't re-enable IRQs at the end of the function. Presumably this
function is mostly called with IRQs disabled or it would be obvious in
testing.
Fixes: aceeffbb59bb ("zfcp: trace full payload of all SAN records (req,resp,iels)")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit aceeffbb59bb91404a0bda32a542d7ebf878433a upstream.
This was lost with commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
but is necessary for problem determination, e.g. to see the
currently active zone set during automatic port scan.
For the large GPN_FT response (4 pages), save space by not dumping
any empty residual entries.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Reviewed-by: Alexey Ishchuk <aishchuk@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 94db3725f049ead24c96226df4a4fb375b880a77 upstream.
commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
started to add FC_CT_HDR_LEN which made zfcp dump random data
out of bounds for RSPN GS responses because u.rspn.rsp
is the largest and last field in the union of struct zfcp_fc_req.
Other request/response types only happened to stay within bounds
due to the padding of the union or
due to the trace capping of u.gspn.rsp to ZFCP_DBF_SAN_MAX_PAYLOAD.
Timestamp : ...
Area : SAN
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : ...
Record id : 2
Tag : fsscth2
Request id : 0x...
Destination ID : 0x00fffffc
Payload short : 01000000 fc020000 80020000 00000000
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx <===
00000000 00000000 00000000 00000000
Payload length : 32 <===
struct zfcp_fc_req {
[0] struct zfcp_fsf_ct_els ct_els;
[56] struct scatterlist sg_req;
[96] struct scatterlist sg_rsp;
union {
struct {req; rsp;} adisc; SIZE: 28+28= 56
struct {req; rsp;} gid_pn; SIZE: 24+20= 44
struct {rspsg; req;} gpn_ft; SIZE: 40*4+20=180
struct {req; rsp;} gspn; SIZE: 20+273= 293
struct {req; rsp;} rspn; SIZE: 277+16= 293
[136] } u;
}
SIZE: 432
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Reviewed-by: Alexey Ishchuk <aishchuk@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 771bf03537ddfa4a4dde62ef9dfbc82e4f77ab20 upstream.
With commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
we lost the N_Port-ID where an ELS response comes from.
With commit 7c7dc196814b9e1d5cc254dc579a5fa78ae524f7
("[SCSI] zfcp: Simplify handling of ct and els requests")
we lost the N_Port-ID where a CT response comes from.
It's especially useful if the request SAN trace record
with D_ID was already lost due to trace buffer wrap.
GS uses an open WKA port handle and ELS just a D_ID, and
only for ELS we could get D_ID from QTCB bottom via zfcp_fsf_req.
To cover both cases, add a new field to zfcp_fsf_ct_els
and fill it in on request to use in SAN response trace.
Strictly speaking the D_ID on SAN response is the FC frame's S_ID.
We don't need a field for the other end which is always us.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Fixes: 7c7dc196814b ("[SCSI] zfcp: Simplify handling of ct and els requests")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 7c964ffe586bc0c3d9febe9bf97a2e4b2866e5b7 upstream.
This information was lost with
commit a54ca0f62f953898b05549391ac2a8a4dad6482b
("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
but is required to debug e.g. invalid handle situations.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit d27a7cb91960cf1fdd11b10071e601828cbf4b1f upstream.
Since commit a54ca0f62f953898b05549391ac2a8a4dad6482b
("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
HBA records no longer contain WWPN, D_ID, or LUN
to reduce duplicate information which is already in REC records.
In contrast to "regular" target ports, we don't use recovery to open
WKA ports such as directory/nameserver, so we don't get REC records.
Therefore, introduce pseudo REC running records without any
actual recovery action but including D_ID of WKA port on open/close.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 0102a30a6ff60f4bb4c07358ca3b1f92254a6c25 upstream.
bring back
commit d21e9daa63e009ce5b87bbcaa6d11ce48e07bbbe
("[SCSI] zfcp: Dont use 0 to indicate invalid LUN in rec trace")
which was lost with
commit ae0904f60fab7cb20c48d32eefdd735e478b91fb
("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 35f040df97fa0e94c7851c054ec71533c88b4b81 upstream.
While retaining the actual filtering according to trace level,
the following commits started to write such filtered records
with a hardcoded record level of 1 instead of the actual record level:
commit 250a1352b95e1db3216e5c5d4f4365bea5122f4a
("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
commit a54ca0f62f953898b05549391ac2a8a4dad6482b
("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Now we can distinguish written records again for offline level filtering.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 4eeaa4f3f1d6c47b69f70e222297a4df4743363e upstream.
On a successful end of reopen port forced,
zfcp_erp_strategy_followup_success() re-uses the port erp_action
and the subsequent zfcp_erp_action_cleanup() now
sees ZFCP_ERP_SUCCEEDED with
erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT
instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
but must not perform zfcp_scsi_schedule_rport_register().
We can detect this because the fresh port reopen erp_action
is in its very first step ZFCP_ERP_STEP_UNINITIALIZED.
Otherwise this opens a time window with unblocked rport
(until the followup port reopen recovery would block it again).
If a scsi_cmnd timeout occurs during this time window
fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents
a clean and timely path failover.
This should not happen if the path issue can be recovered
on FC transport layer such as path issues involving RSCNs.
Also, unnecessary and repeated DID_IMM_RETRY for pending and
undesired new requests occur because internally zfcp still
has its zfcp_port blocked.
As follow-on errors with scsi_eh, it can cause,
in the worst case, permanently lost paths due to one of:
sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk!
sd <scsidev>: Device offlined - not ready after error recovery
For fix validation and to aid future debugging with other recoveries
we now also trace (un)blocking of rports.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 5767620c383a ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED")
Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 70369f8e15b220f50a16348c79a61d3f7054813c upstream.
In the hardware data router case, introduced with kernel 3.2
commit 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router")
the ELS/GS request&response length needs to be initialized
as in the chained SBAL case.
Otherwise, the FCP channel rejects ELS requests with
FSF_REQUEST_SIZE_TOO_LARGE.
Such ELS requests can be issued by user space through BSG / HBA API,
or zfcp itself uses ADISC ELS for remote port link test on RSCN.
The latter can cause a short path outage due to
unnecessary remote target port recovery because the always
failing ADISC cannot detect extremely short path interruptions
beyond the local FCP channel.
Below example is decoded with zfcpdbf from s390-tools:
Timestamp : ...
Area : SAN
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : zfcp_dbf_san_req+0408
Record id : 1
Tag : fssels1
Request id : 0x<reqid>
Destination ID : 0x00<target d_id>
Payload info : 52000000 00000000 <our wwpn > [ADISC]
<our wwnn > 00<s_id> 00000000
00000000 00000000 00000000 00000000
Timestamp : ...
Area : HBA
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : zfcp_dbf_hba_fsf_res+0740
Record id : 1
Tag : fs_ferr
Request id : 0x<reqid>
Request status : 0x00000010
FSF cmnd : 0x0000000b [FSF_QTCB_SEND_ELS]
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000061 [FSF_REQUEST_SIZE_TOO_LARGE]
FSF stat qual : 00000000 00000000 00000000 00000000
Prot stat : 0x00000100
Prot stat qual : 00000000 00000000 00000000 00000000
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit bd77befa5bcff8c51613de271913639edf85fbc2 upstream.
For an NPIV-enabled FCP device, zfcp can erroneously show
"NPort (fabric via point-to-point)" instead of "NPIV VPORT"
for the port_type sysfs attribute of the corresponding
fc_host.
s390-tools that can be affected are dbginfo.sh and ziomon.
zfcp_fsf_exchange_config_evaluate() ignores
fsf_qtcb_bottom_config.connection_features indicating NPIV
and only sets fc_host_port_type to FC_PORTTYPE_NPORT if
fsf_qtcb_bottom_config.fc_topology is FSF_TOPO_FABRIC.
Only the independent zfcp_fsf_exchange_port_evaluate()
evaluates connection_features to overwrite fc_host_port_type
to FC_PORTTYPE_NPIV in case of NPIV.
Code was introduced with upstream kernel 2.6.30
commit 0282985da5923fa6365adcc1a1586ae0c13c1617
("[SCSI] zfcp: Report fc_host_port_type as NPIV").
This works during FCP device recovery (such as set online)
because it performs FSF_QTCB_EXCHANGE_CONFIG_DATA followed by
FSF_QTCB_EXCHANGE_PORT_DATA in sequence.
However, the zfcp-specific scsi host sysfs attributes
"requests", "megabytes", or "seconds_active" trigger only
zfcp_fsf_exchange_config_evaluate() resetting fc_host
port_type to FC_PORTTYPE_NPORT despite NPIV.
The zfcp-specific scsi host sysfs attribute "utilization"
triggers only zfcp_fsf_exchange_port_evaluate() correcting
the fc_host port_type again in case of NPIV.
Evaluate fsf_qtcb_bottom_config.connection_features
in zfcp_fsf_exchange_config_evaluate() where it belongs to.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 0282985da592 ("[SCSI] zfcp: Report fc_host_port_type as NPIV")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 9ba333dc55cbb9523553df973adb3024d223e905 upstream.
When a device is in a status where CIO has killed all I/O by itself the
interrupt for a clear request may not contain an irb to determine the
clear function. Instead it contains an error pointer -EIO.
This was ignored by the DASD int_handler leading to a hanging device
waiting for a clear interrupt.
Handle -EIO error pointer correctly for requests that are clear pending and
treat the clear as successful.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit e39c17904aadf3a107b2bc292c03bfd9f850fd08
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Thu Mar 3 15:07:51 2016 -0800
Linux 3.10.99
commit d012f71377e1dad1165a0926c2920043e4047438
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Thu Feb 11 16:10:26 2016 -0500
xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted.
commit 4d8c8bd6f2062c9988817183a91fe2e623c8aa5e upstream.
Occasionaly PV guests would crash with:
pciback 0000:00:00.1: Xen PCI mapped GSI0 to IRQ16
BUG: unable to handle kernel paging request at 0000000d1a8c0be0
.. snip..
<ffffffff8139ce1b>] find_next_bit+0xb/0x10
[<ffffffff81387f22>] cpumask_next_and+0x22/0x40
[<ffffffff813c1ef8>] pci_device_probe+0xb8/0x120
[<ffffffff81529097>] ? driver_sysfs_add+0x77/0xa0
[<ffffffff815293e4>] driver_probe_device+0x1a4/0x2d0
[<ffffffff813c1ddd>] ? pci_match_device+0xdd/0x110
[<ffffffff81529657>] __device_attach_driver+0xa7/0xb0
[<ffffffff815295b0>] ? __driver_attach+0xa0/0xa0
[<ffffffff81527622>] bus_for_each_drv+0x62/0x90
[<ffffffff8152978d>] __device_attach+0xbd/0x110
[<ffffffff815297fb>] device_attach+0xb/0x10
[<ffffffff813b75ac>] pci_bus_add_device+0x3c/0x70
[<ffffffff813b7618>] pci_bus_add_devices+0x38/0x80
[<ffffffff813dc34e>] pcifront_scan_root+0x13e/0x1a0
[<ffffffff817a0692>] pcifront_backend_changed+0x262/0x60b
[<ffffffff814644c6>] ? xenbus_gather+0xd6/0x160
[<ffffffff8120900f>] ? put_object+0x2f/0x50
[<ffffffff81465c1d>] xenbus_otherend_changed+0x9d/0xa0
[<ffffffff814678ee>] backend_changed+0xe/0x10
[<ffffffff81463a28>] xenwatch_thread+0xc8/0x190
[<ffffffff810f22f0>] ? woken_wake_function+0x10/0x10
which was the result of two things:
When we call pci_scan_root_bus we would pass in 'sd' (sysdata)
pointer which was an 'pcifront_sd' structure. However in the
pci_device_add it expects that the 'sd' is 'struct sysdata' and
sets the dev->node to what is in sd->node (offset 4):
set_dev_node(&dev->dev, pcibus_to_node(bus));
__pcibus_to_node(const struct pci_bus *bus)
{
const struct pci_sysdata *sd = bus->sysdata;
return sd->node;
}
However our structure was pcifront_sd which had nothing at that
offset:
struct pcifront_sd {
int domain; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
struct pcifront_device * pdev; /* 8 8 */
}
That is an hole - filled with garbage as we used kmalloc instead of
kzalloc (the second problem).
This patch fixes the issue by:
1) Use kzalloc to initialize to a well known state.
2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That
way access to the 'node' will access the right offset.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b153db5ac6c1598022b2b11ffb3be1303e3bffb
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Sat Feb 27 19:17:33 2016 -0500
do_last(): don't let a bogus return value from ->open() et.al. to confuse us
commit c80567c82ae4814a41287618e315a60ecf513be6 upstream.
... into returning a positive to path_openat(), which would interpret that
as "symlink had been encountered" and proceed to corrupt memory, etc.
It can only happen due to a bug in some ->open() instance or in some LSM
hook, etc., so we report any such event *and* make sure it doesn't trick
us into further unpleasantness.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b19e7a870c21104186d5834d469b6e4a60d5cc6a
Author: Simon Guinot <simon.guinot@sequanux.org>
Date: Thu Sep 10 00:15:18 2015 +0200
kernel/resource.c: fix muxed resource handling in __request_region()
commit 59ceeaaf355fa0fb16558ef7c24413c804932ada upstream.
In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released. A pointer on the conflicting resource is kept. At wake-up
this pointer is used as a parent to retry to request the region.
A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed). Another
problem is that the next call to __request_region() fails to detect a
remaining conflict. The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself. It is likely
to succeed anyway, even if there is still a conflict.
Instead, the parent of the conflicting resource should be passed to
__request_region().
As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.
Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7d9970fb5419b78310fb827615bd14bf7d94963
Author: Stefan Hajnoczi <stefanha@redhat.com>
Date: Thu Feb 18 18:55:54 2016 +0000
sunrpc/cache: fix off-by-one in qword_get()
commit b7052cd7bcf3c1478796e93e3dff2b44c9e82943 upstream.
The qword_get() function NUL-terminates its output buffer. If the input
string is in hex format \xXXXX... and the same length as the output
buffer, there is an off-by-one:
int qword_get(char **bpp, char *dest, int bufsize)
{
...
while (len < bufsize) {
...
*dest++ = (h << 4) | l;
len++;
}
...
*dest = '\0';
return len;
}
This patch ensures the NUL terminator doesn't fall outside the output
buffer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb63a905ff5f0d258693fd9991a94bc49188dcc3
Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Date: Wed Feb 24 09:04:24 2016 -0500
tracing: Fix showing function event in available_events
commit d045437a169f899dfb0f6f7ede24cc042543ced9 upstream.
The ftrace:function event is only displayed for parsing the function tracer
data. It is not used to enable function tracing, and does not include an
"enable" file in its event directory.
Originally, this event was kept separate from other events because it did
not have a ->reg parameter. But perf added a "reg" parameter for its use
which caused issues, because it made the event available to functions where
it was not compatible for.
Commit 9b63776fa3ca9 "tracing: Do not enable function event with enable"
added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
from being enabled by normal trace events. But this commit missed keeping
the function event from being displayed by the "available_events" directory,
which is used to show what events can be enabled by set_event.
One documented way to enable all events is to:
cat available_events > set_event
But because the function event is displayed in the available_events, this
now causes an INVALID error:
cat: write error: Invalid argument
Reported-by: Chunyu Hu <chuhu@redhat.com>
Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd54a801362f13ac756b4de4bb65d1a48c7d5fad
Author: Christian Borntraeger <borntraeger@de.ibm.com>
Date: Fri Feb 19 13:11:46 2016 +0100
KVM: async_pf: do not warn on page allocation failures
commit d7444794a02ff655eda87e3cc54e86b940e7736f upstream.
In async_pf we try to allocate with NOWAIT to get an element quickly
or fail. This code also handle failures gracefully. Lets silence
potential page allocation failures under load.
qemu-system-s39: page allocation failure: order:0,mode:0x2200000
[...]
Call Trace:
([<00000000001146b8>] show_trace+0xf8/0x148)
[<000000000011476a>] show_stack+0x62/0xe8
[<00000000004a36b8>] dump_stack+0x70/0x98
[<0000000000272c3a>] warn_alloc_failed+0xd2/0x148
[<000000000027709e>] __alloc_pages_nodemask+0x94e/0xb38
[<00000000002cd36a>] new_slab+0x382/0x400
[<00000000002cf7ac>] ___slab_alloc.constprop.30+0x2dc/0x378
[<00000000002d03d0>] kmem_cache_alloc+0x160/0x1d0
[<0000000000133db4>] kvm_setup_async_pf+0x6c/0x198
[<000000000013dee8>] kvm_arch_vcpu_ioctl_run+0xd48/0xd58
[<000000000012fcaa>] kvm_vcpu_ioctl+0x372/0x690
[<00000000002f66f6>] do_vfs_ioctl+0x3be/0x510
[<00000000002f68ec>] SyS_ioctl+0xa4/0xb8
[<0000000000781c5e>] system_call+0xd6/0x264
[<000003ffa24fa06a>] 0x3ffa24fa06a
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c8e8b9e50d8e19ae8cbc721cd0a4da8af17fa26
Author: Christoph Hellwig <hch@lst.de>
Date: Mon Feb 8 21:11:50 2016 +0100
nfs: fix nfs_size_to_loff_t
commit 50ab8ec74a153eb30db26529088bc57dd700b24c upstream.
See http: //www.infradead.org/rpr.html
X-Evolution-Source: 1451162204.2173.11@leira.trondhjem.org
Content-Transfer-Encoding: 8bit
Mime-Version: 1.0
We support OFFSET_MAX just fine, so don't round down below it. Also
switch to using min_t to make the helper more readable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: 433c92379d9c ("NFS: Clean up nfs_size_to_loff_t()")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dcc121e02f773b4b8fc88522810214af39ea8313
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Mon Jan 25 10:08:00 2016 -0600
PCI/AER: Flush workqueue on device remove to avoid use-after-free
commit 4ae2182b1e3407de369f8c5d799543b7db74221b upstream.
A Root Port's AER structure (rpc) contains a queue of events. aer_irq()
enqueues AER status information and schedules aer_isr() to dequeue and
process it. When we remove a device, aer_remove() waits for the queue to
be empty, then frees the rpc struct.
But aer_isr() references the rpc struct after dequeueing and possibly
emptying the queue, which can cause a use-after-free error as in the
following scenario with two threads, aer_isr() on the left and a
concurrent aer_remove() on the right:
Thread A Thread B
-------- --------
aer_irq():
rpc->prod_idx++
aer_remove():
wait_event(rpc->prod_idx == rpc->cons_idx)
# now blocked until queue becomes empty
aer_isr(): # ...
rpc->cons_idx++ # unblocked because queue is now empty
... kfree(rpc)
mutex_unlock(&rpc->rpc_mutex)
To prevent this problem, use flush_work() to wait until the last scheduled
instance of aer_isr() has completed before freeing the rpc struct in
aer_remove().
I reproduced this use-after-free by flashing a device FPGA and
re-enumerating the bus to find the new device. With SLUB debug, this
crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in
GPR25:
pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000
Unable to handle kernel paging request for data at address 0x27ef9e3e
Workqueue: events aer_isr
GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0
NIP [602f5328] pci_walk_bus+0xd4/0x104
[bhelgaas: changelog, stable tag]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43a349917d77250e1fb7dbcb44e50a80b8cab026
Author: Tejun Heo <tj@kernel.org>
Date: Mon Feb 1 11:33:21 2016 -0500
libata: fix sff host state machine locking while polling
commit 8eee1d3ed5b6fc8e14389567c9a6f53f82bb7224 upstream.
The bulk of ATA host state machine is implemented by
ata_sff_hsm_move(). The function is called from either the interrupt
handler or, if polling, a work item. Unlike from the interrupt path,
the polling path calls the function without holding the host lock and
ata_sff_hsm_move() selectively grabs the lock.
This is completely broken. If an IRQ triggers while polling is in
progress, the two can easily race and end up accessing the hardware
and updating state machine state at the same time. This can put the
state machine in an illegal state and lead to a crash like the
following.
kernel BUG at drivers/ata/libata-sff.c:1302!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000
RIP: 0010:[<ffffffff83a83409>] [<ffffffff83a83409>] ata_sff_hsm_move+0x619/0x1c60
...
Call Trace:
<IRQ>
[<ffffffff83a84c31>] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584
[<ffffffff83a85611>] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877
[< inline >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629
[<ffffffff83a85bf3>] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902
[<ffffffff81479f98>] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157
[<ffffffff8147a717>] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205
[<ffffffff81484573>] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623
[< inline >] generic_handle_irq_desc include/linux/irqdesc.h:146
[<ffffffff811a92bc>] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78
[<ffffffff811a7e4d>] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240
[<ffffffff86653d4c>] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520
<EOI>
[< inline >] rcu_lock_acquire include/linux/rcupdate.h:490
[< inline >] rcu_read_lock include/linux/rcupdate.h:874
[<ffffffff8164b4a1>] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145
[< inline >] do_fault_around mm/memory.c:2943
[< inline >] do_read_fault mm/memory.c:2962
[< inline >] do_fault mm/memory.c:3133
[< inline >] handle_pte_fault mm/memory.c:3308
[< inline >] __handle_mm_fault mm/memory.c:3418
[<ffffffff816efb16>] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447
[<ffffffff8127dc16>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238
[<ffffffff8127e358>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
[<ffffffff8126f514>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
[<ffffffff86655578>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986
Fix it by ensuring that the polling path is holding the host lock
before entering ata_sff_hsm_move() so that all hardware accesses and
state updates are performed under the host lock.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49662cfccb3f4ade7beb77a1f67f2c2db0027edc
Author: Tejun Heo <tj@kernel.org>
Date: Tue Feb 9 16:11:26 2016 -0500
Revert "workqueue: make sure delayed work run in local cpu"
commit 041bd12e272c53a35c54c13875839bcb98c999ce upstream.
This reverts commit 874bbfe600a660cba9c776b3957b1ce393151b76.
Workqueue used to implicity guarantee that work items queued without
explicit CPU specified are put on the local CPU. Recent changes in
timer broke the guarantee and led to vmstat breakage which was fixed
by 176bed1de5bf ("vmstat: explicitly schedule per-cpu work on the CPU
we need it to run on").
vmstat is the most likely to expose the issue and it's quite possible
that there are other similar problems which are a lot more difficult
to trigger. As a preventive measure, 874bbfe600a6 ("workqueue: make
sure delayed work run in local cpu") was applied to restore the local
CPU guarnatee. Unfortunately, the change exposed a bug in timer code
which got fixed by 22b886dd1018 ("timers: Use proper base migration in
add_timer_on()"). Due to code restructuring, the commit couldn't be
backported beyond certain point and stable kernels which only had
874bbfe600a6 started crashing.
The local CPU guarantee was accidental more than anything else and we
want to get rid of it anyway. As, with the vmstat case fixed,
874bbfe600a6 is causing more problems than it's fixing, it has been
decided to take the chance and officially break the guarantee by
reverting the commit. A debug feature will be added to force foreign
CPU assignment to expose cases relying on the guarantee and fixes for
the individual cases will be backported to stable as necessary.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu")
Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Shaohua Li <shli@fb.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01e527fd4c73d05125d04ce3bfd4413eb5af2581
Author: Johannes Berg <johannes.berg@intel.com>
Date: Tue Jan 26 11:29:03 2016 +0100
rfkill: fix rfkill_fop_read wait_event usage
commit 6736fde9672ff6717ac576e9bba2fd5f3dfec822 upstream.
The code within wait_event_interruptible() is called with
!TASK_RUNNING, so mustn't call any functions that can sleep,
like mutex_lock().
Since we re-check the list_empty() in a loop after the wait,
it's safe to simply use list_empty() without locking.
This bug has existed forever, but was only discovered now
because all userspace implementations, including the default
'rfkill' tool, use poll() or select() to get a readable fd
before attempting to read.
Fixes: c64fb01627e24 ("rfkill: create useful userspace interface")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb80decbf08882c0dc6573d329fd9b5c1aff3c31
Author: Oliver Neukum <oneukum@suse.com>
Date: Mon Jan 18 15:45:18 2016 +0100
cdc-acm:exclude Samsung phone 04e8:685d
commit e912e685f372ab62a2405a1acd923597f524e94a upstream.
This phone needs to be handled by a specialised firmware tool
and is reported to crash irrevocably if cdc-acm takes it.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 636a9c8a87da5056b4254ff9eaf67cf52c8c2d1d
Author: Ilya Dryomov <idryomov@gmail.com>
Date: Wed Feb 17 20:04:08 2016 +0100
libceph: don't bail early from try_read() when skipping a message
commit e7a88e82fe380459b864e05b372638aeacb0f52d upstream.
The contract between try_read() and try_write() is that when called
each processes as much data as possible. When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more. try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.
Reported-by: Varada Kari <Varada.Kari@sandisk.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Varada Kari <Varada.Kari@sandisk.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6c92a436f3e930f7d7d8d456cf3ac36602039cf
Author: Mike Marciniszyn <mike.marciniszyn@intel.com>
Date: Thu Jan 7 16:44:10 2016 -0500
IB/qib: fix mcast detach when qp not attached
commit 09dc9cd6528f5b52bcbd3292a6312e762c85260f upstream.
The code produces the following trace:
[1750924.419007] general protection fault: 0000 [#3] SMP
[1750924.420364] Modules linked in: nfnetlink autofs4 rpcsec_gss_krb5 nfsv4
dcdbas rfcomm bnep bluetooth nfsd auth_rpcgss nfs_acl dm_multipath nfs lockd
scsi_dh sunrpc fscache radeon ttm drm_kms_helper drm serio_raw parport_pc
ppdev i2c_algo_bit lpc_ich ipmi_si ib_mthca ib_qib dca lp parport ib_ipoib
mac_hid ib_cm i3000_edac ib_sa ib_uverbs edac_core ib_umad ib_mad ib_core
ib_addr tg3 ptp dm_mirror dm_region_hash dm_log psmouse pps_core
[1750924.420364] CPU: 1 PID: 8401 Comm: python Tainted: G D
3.13.0-39-generic #66-Ubuntu
[1750924.420364] Hardware name: Dell Computer Corporation PowerEdge
860/0XM089, BIOS A04 07/24/2007
[1750924.420364] task: ffff8800366a9800 ti: ffff88007af1c000 task.ti:
ffff88007af1c000
[1750924.420364] RIP: 0010:[<ffffffffa0131d51>] [<ffffffffa0131d51>]
qib_mcast_qp_free+0x11/0x50 [ib_qib]
[1750924.420364] RSP: 0018:ffff88007af1dd70 EFLAGS: 00010246
[1750924.420364] RAX: 0000000000000001 RBX: ffff88007b822688 RCX:
000000000000000f
[1750924.420364] RDX: ffff88007b822688 RSI: ffff8800366c15a0 RDI:
6764697200000000
[1750924.420364] RBP: ffff88007af1dd78 R08: 0000000000000001 R09:
0000000000000000
[1750924.420364] R10: 0000000000000011 R11: 0000000000000246 R12:
ffff88007baa1d98
[1750924.420364] R13: ffff88003ecab000 R14: ffff88007b822660 R15:
0000000000000000
[1750924.420364] FS: 00007ffff7fd8740(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[1750924.420364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1750924.420364] CR2: 00007ffff597c750 CR3: 000000006860b000 CR4:
00000000000007e0
[1750924.420364] Stack:
[1750924.420364] ffff88007b822688 ffff88007af1ddf0 ffffffffa0132429
000000007af1de20
[1750924.420364] ffff88007baa1dc8 ffff88007baa0000 ffff88007af1de70
ffffffffa00cb313
[1750924.420364] 00007fffffffde88 0000000000000000 0000000000000008
ffff88003ecab000
[1750924.420364] Call Trace:
[1750924.420364] [<ffffffffa0132429>] qib_multicast_detach+0x1e9/0x350
[ib_qib]
[1750924.568035] [<ffffffffa00cb313>] ? ib_uverbs_modify_qp+0x323/0x3d0
[ib_uverbs]
[1750924.568035] [<ffffffffa0092d61>] ib_detach_mcast+0x31/0x50 [ib_core]
[1750924.568035] [<ffffffffa00cc213>] ib_uverbs_detach_mcast+0x93/0x170
[ib_uverbs]
[1750924.568035] [<ffffffffa00c61f6>] ib_uverbs_write+0xc6/0x2c0 [ib_uverbs]
[1750924.568035] [<ffffffff81312e68>] ? apparmor_file_permission+0x18/0x20
[1750924.568035] [<ffffffff812d4cd3>] ? security_file_permission+0x23/0xa0
[1750924.568035] [<ffffffff811bd214>] vfs_write+0xb4/0x1f0
[1750924.568035] [<ffffffff811bdc49>] SyS_write+0x49/0xa0
[1750924.568035] [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f
[1750924.568035] Code: 66 2e 0f 1f 84 00 00 00 00 00 31 c0 5d c3 66 2e 0f 1f
84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 10
<f0> ff 8f 40 01 00 00 74 0e 48 89 df e8 8e f8 06 e1 5b 5d c3 0f
[1750924.568035] RIP [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50
[ib_qib]
[1750924.568035] RSP <ffff88007af1dd70>
[1750924.650439] ---[ end trace 73d5d4b3f8ad4851 ]
The fix is to note the qib_mcast_qp that was found. If none is found, then
return EINVAL indicating the error.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a81bfc00a3f9eb778bad0d97551d91c332ec2f1b
Author: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Date: Mon Feb 15 19:41:47 2016 +0100
drm/radeon: use post-decrement in error handling
commit bc3f5d8c4ca01555820617eb3b6c0857e4df710d upstream.
We need to use post-decrement to get the pci_map_page undone also for
i==0, and to avoid some very unpleasant behaviour if pci_map_page
failed already at i==0.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
Author: Nicolai Hähnle <nicolai.haehnle@amd.com>
Date: Fri Feb 5 14:35:53 2016 -0500
drm/radeon: hold reference to fences in radeon_sa_bo_new
commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
An arbitrary amount of time can pass between spin_unlock and
radeon_fence_wait_any, so we need to ensure that nobody frees the
fences from under us.
Based on the analogous fix for amdgpu.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb071fb6ace47b8422edd02ee91116d4f31c2c92
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Thu Dec 17 12:52:17 2015 -0500
drm/radeon: clean up fujitsu quirks
commit 0eb1c3d4084eeb6fb3a703f88d6ce1521f8fcdd1 upstream.
Combine the two quirks.
bug:
https://bugzilla.kernel.org/show_bug.cgi?id=109481
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a1c393a2a324f2a1210a72fa60e5c9da03b1f1ce
Author: Rob Clark <robdclark@gmail.com>
Date: Wed Oct 15 15:00:47 2014 -0400
drm/vmwgfx: respect 'nomodeset'
commit 96c5d076f0a5e2023ecdb44d8261f87641ee71e0 upstream.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39e88dd4da3ddc4c07150fe75d9590a648d0eb0f
Author: Dmitry V. Levin <ldv@altlinux.org>
Date: Sun Dec 27 02:13:27 2015 +0300
sparc64: fix incorrect sign extension in sys_sparc64_personality
commit 525fd5a94e1be0776fa652df5c687697db508c91 upstream.
The value returned by sys_personality has type "long int".
It is saved to a variable of type "int", which is not a problem
yet because the type of task_struct->pesonality is "unsigned int".
The problem is the sign extension from "int" to "long int"
that happens on return from sys_sparc64_personality.
For example, a userspace call personality((unsigned) -EINVAL) will
result to any subsequent personality call, including absolutely
harmless read-only personality(0xffffffff) call, failing with
errno set to EINVAL.
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf64271c53bfc54deeebf7ffa69d2df03ae58cf5
Author: Linus Walleij <linus.walleij@linaro.org>
Date: Mon Jan 4 02:21:55 2016 +0100
mmc: mmci: fix an ages old detection error
commit 0bcb7efdff63564e80fe84dd36a9fbdfbf6697a4 upstream.
commit 4956e10903fd ("ARM: 6244/1: mmci: add variant data and default
MCICLOCK support") added variant data for ARM, U300 and Ux500 variants.
The Nomadik NHK8815/8820 variant was erroneously labeled as a U300
variant, and when the proper Nomadik variant was later introduced in
commit 34fd421349ff ("ARM: 7378/1: mmci: add support for the Nomadik MMCI
variant") this was not fixes. Let's say this fixes the latter commit as
there was no proper Nomadik support until then.
Fixes: 34fd421349ff ("ARM: 7378/1: mmci: add support for the Nomadik...")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 794c33bd215bbd1f7a3513f9c70a8e0afcbfcd7a
Author: Richard Cochran <richardcochran@gmail.com>
Date: Tue Dec 22 22:19:58 2015 +0100
posix-clock: Fix return code on the poll method's error path
commit 1b9f23727abb92c5e58f139e7d180befcaa06fe0 upstream.
The posix_clock_poll function is supposed to return a bit mask of
POLLxxx values. However, in case the hardware has disappeared (due to
hot plugging for example) this code returns -ENODEV in a futile
attempt to throw an error at the file descriptor level. The kernel's
file_operations interface does not accept such error codes from the
poll method. Instead, this function aught to return POLLERR.
The value -ENODEV does, in fact, contain the POLLERR bit (and almost
all the other POLLxxx bits as well), but only by chance. This patch
fixes code to return a proper bit mask.
Credit goes to Markus Elfring for pointing out the suspicious
signed/unsigned mismatch.
Reported-by: Markus Elfring <elfring@users.sourceforge.net>
igned-off-by: Richard Cochran <richardcochran@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Link: http://lkml.kernel.org/r/1450819198-17420-1-git-send-email-richardcochran@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eeb7b0e01658684a743ccfd66a668e8a56d5ebb9
Author: Mikulas Patocka <mpatocka@redhat.com>
Date: Fri Jan 8 19:07:55 2016 -0500
dm snapshot: fix hung bios when copy error occurs
commit 385277bfb57faac44e92497104ba542cdd82d5fe upstream.
When there is an error copying a chunk dm-snapshot can incorrectly hold
associated bios indefinitely, resulting in hung IO.
The function copy_callback sets pe->error if there was error copying the
chunk, and then calls complete_exception. complete_exception calls
pending_complete on error, otherwise it calls commit_exception with
commit_callback (and commit_callback calls complete_exception).
The persistent exception store (dm-snap-persistent.c) assumes that calls
to prepare_exception and commit_exception are paired.
persistent_prepare_exception increases ps->pending_count and
persistent_commit_exception decreases it.
If there is a copy error, persistent_prepare_exception is called but
persistent_commit_exception is not. This results in the variable
ps->pending_count never returning to zero and that causes some pending
exceptions (and their associated bios) to be held forever.
Fix this by unconditionally calling commit_exception regardless of
whether the copy was successful. A new "valid" parameter is added to
commit_exception -- when the copy fails this parameter is set to zero so
that the chunk that failed to copy (and all following chunks) is not
recorded in the snapshot store. Also, remove commit_callback now that
it is merely a wrapper around pending_complete.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cd529968dd0647bf75e24f4e36fef99fc536b58
Author: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Date: Wed Feb 3 17:33:48 2016 -0200
tda1004x: only update the frontend properties if locked
commit e8beb02343e7582980c6705816cd957cf4f74c7a upstream.
The tda1004x was updating the properties cache before locking.
If the device is not locked, the data at the registers are just
random values with no real meaning.
This caused the driver to fail with libdvbv5, as such library
calls GET_PROPERTY from time to time, in order to return the
DVB stats.
Tested with a saa7134 card 78:
ASUSTeK P7131 Dual, vendor PCI ID: 1043:4862
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 590d6fd45445a32e6efecd8158a96e8dd09f8281
Author: Antonio Ospite <ao2@ao2.it>
Date: Fri Oct 2 17:33:13 2015 -0300
gspca: ov534/topro: prevent a division by 0
commit dcc7fdbec53a960588f2c40232db2c6466c09917 upstream.
v4l2-compliance sends a zeroed struct v4l2_streamparm in
v4l2-test-formats.cpp::testParmType(), and this results in a division by
0 in some gspca subdrivers:
divide error: 0000 [#1] SMP
Modules linked in: gspca_ov534 gspca_main ...
CPU: 0 PID: 17201 Comm: v4l2-compliance Not tainted 4.3.0-rc2-ao2 #1
Hardware name: System manufacturer System Product Name/M2N-E SLI, BIOS
ASUS M2N-E SLI ACPI BIOS Revision 1301 09/16/2010
task: ffff8800818306c0 ti: ffff880095c4c000 task.ti: ffff880095c4c000
RIP: 0010:[<ffffffffa079bd62>] [<ffffffffa079bd62>] sd_set_streamparm+0x12/0x60 [gspca_ov534]
RSP: 0018:ffff880095c4fce8 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffff8800c9522000 RCX: ffffffffa077a140
RDX: 0000000000000000 RSI: ffff880095e0c100 RDI: ffff8800c9522000
RBP: ffff880095e0c100 R08: ffffffffa077a100 R09: 00000000000000cc
R10: ffff880067ec7740 R11: 0000000000000016 R12: ffffffffa07bb400
R13: 0000000000000000 R14: ffff880081b6a800 R15: 0000000000000000
FS: 00007fda0de78740(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000014630f8 CR3: 00000000cf349000 CR4: 00000000000006f0
Stack:
ffffffffa07a6431 ffff8800c9522000 ffffffffa077656e 00000000c0cc5616
ffff8800c9522000 ffffffffa07a5e20 ffff880095e0c100 0000000000000000
ffff880067ec7740 ffffffffa077a140 ffff880067ec7740 0000000000000016
Call Trace:
[<ffffffffa07a6431>] ? v4l_s_parm+0x21/0x50 [videodev]
[<ffffffffa077656e>] ? vidioc_s_parm+0x4e/0x60 [gspca_main]
[<ffffffffa07a5e20>] ? __video_do_ioctl+0x280/0x2f0 [videodev]
[<ffffffffa07a5ba0>] ? video_ioctl2+0x20/0x20 [videodev]
[<ffffffffa07a59b9>] ? video_usercopy+0x319/0x4e0 [videodev]
[<ffffffff81182dc1>] ? page_add_new_anon_rmap+0x71/0xa0
[<ffffffff811afb92>] ? mem_cgroup_commit_charge+0x52/0x90
[<ffffffff81179b18>] ? handle_mm_fault+0xc18/0x1680
[<ffffffffa07a15cc>] ? v4l2_ioctl+0xac/0xd0 [videodev]
[<ffffffff811c846f>] ? do_vfs_ioctl+0x28f/0x480
[<ffffffff811c86d4>] ? SyS_ioctl+0x74/0x80
[<ffffffff8154a8b6>] ? entry_SYSCALL_64_fastpath+0x16/0x75
Code: c7 93 d9 79 a0 5b 5d e9 f1 f3 9a e0 0f 1f 00 66 2e 0f 1f 84 00
00 00 00 00 66 66 66 66 90 53 31 d2 48 89 fb 48 83 ec 08 8b 46 10 <f7>
76 0c 80 bf ac 0c 00 00 00 88 87 4e 0e 00 00 74 09 80 bf 4f
RIP [<ffffffffa079bd62>] sd_set_streamparm+0x12/0x60 [gspca_ov534]
RSP <ffff880095c4fce8>
---[ end trace 279710c2c6c72080 ]---
Following what the doc says about a zeroed timeperframe (see
http://www.linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-parm.html):
...
To reset manually applications can just set this field to zero.
fix the issue by resetting the frame rate to a default value in case of
an unusable timeperframe.
The fix is done in the subdrivers instead of gspca.c because only the
subdrivers have notion of a default frame rate to reset the camera to.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b29bcfbae9971d931fd65f6ae075d394907517b
Author: Malcolm Priestley <tvboxspy@gmail.com>
Date: Mon Aug 31 06:13:45 2015 -0300
media: dvb-core: Don't force CAN_INVERSION_AUTO in oneshot mode
commit c9d57de6103e343f2d4e04ea8d9e417e10a24da7 upstream.
When in FE_TUNE_MODE_ONESHOT the frontend must report
the actual capabilities so user can take appropriate
action.
With frontends that can't do auto inversion this is done
by dvb-core automatically so CAN_INVERSION_AUTO is valid.
However, when in FE_TUNE_MODE_ONESHOT this is not true.
So only set FE_CAN_INVERSION_AUTO in modes other than
FE_TUNE_MODE_ONESHOT
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c77a51256e0b05b0948c1f5dd06dfb2b5abe489
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date: Wed Dec 16 21:59:56 2015 +0100
uml: fix hostfs mknod()
commit 9f2dfda2f2f1c6181c3732c16b85c59ab2d195e0 upstream.
An inverted return value check in hostfs_mknod() caused the function
to return success after handling it as an error (and cleaning up).
It resulted in the following segfault when trying to bind() a named
unix socket:
Pid: 198, comm: a.out Not tainted 4.4.0-rc4
RIP: 0033:[<0000000061077df6>]
RSP: 00000000daae5d60 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 000000006092a460 RCX: 00000000dfc54208
RDX: 0000000061073ef1 RSI: 0000000000000070 RDI: 00000000e027d600
RBP: 00000000daae5de0 R08: 00000000da980ac0 R09: 0000000000000000
R10: 0000000000000003 R11: 00007fb1ae08f72a R12: 0000000000000000
R13: 000000006092a460 R14: 00000000daaa97c0 R15: 00000000daaa9a88
Kernel panic - not syncing: Kernel mode fault at addr 0x40, ip 0x61077df6
CPU: 0 PID: 198 Comm: a.out Not tainted 4.4.0-rc4 #1
Stack:
e027d620 dfc54208 0000006f da981398
61bee000 0000c1ed daae5de0 0000006e
e027d620 dfcd4208 00000005 6092a460
Call Trace:
[<60dedc67>] SyS_bind+0xf7/0x110
[<600587be>] handle_syscall+0x7e/0x80
[<60066ad7>] userspace+0x3e7/0x4e0
[<6006321f>] ? save_registers+0x1f/0x40
[<6006c88e>] ? arch_prctl+0x1be/0x1f0
[<60054985>] fork_handler+0x85/0x90
Let's also get rid of the "cosmic ray protection" while we're at it.
Fixes: e9193059b1b3 "hostfs: fix races in dentry_name() and inode_name()"
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cbb43b99bf138e44deef9957678bc464f3bfd82
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date: Fri Dec 18 21:28:53 2015 +0100
uml: flush stdout before forking
commit 0754fb298f2f2719f0393491d010d46cfb25d043 upstream.
I was seeing some really weird behaviour where piping UML's output
somewhere would cause output to get duplicated:
$ ./vmlinux | head -n 40
Checking that ptrace can change system call numbers...Core dump limits :
soft - 0
hard - NONE
OK
Checking syscall emulation patch for ptrace...Core dump limits :
soft - 0
hard - NONE
OK
Checking advanced syscall emulation patch for ptrace...Core dump limits :
soft - 0
hard - NONE
OK
Core dump limits :
soft - 0
hard - NONE
This is because these tests do a fork() which duplicates the non-empty
stdout buffer, then glibc flushes the duplicated buffer as each child
exits.
A simple workaround is to flush before forking.
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 509a4bccfecaaf92af4bc27b9b212fa4a24e756c
Author: Stefan Haberland <stefan.haberland@de.ibm.com>
Date: Tue Dec 15 10:45:05 2015 +0100
s390/dasd: fix refcount for PAV reassignment
commit 9d862ababb609439c5d6987f6d3ddd09e703aa0b upstream.
Add refcount to the DASD device when a summary unit check worker is
scheduled. This prevents that the device is set offline with worker
in place.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9155d3f258f83643bad27297e7539a67bb81122c
Author: Stefan Haberland <stefan.haberland@de.ibm.com>
Date: Tue Dec 15 10:16:43 2015 +0100
s390/dasd: prevent incorrect length error under z/VM after PAV changes
commit 020bf042e5b397479c1174081b935d0ff15d1a64 upstream.
The channel checks the specified length and the provided amount of
data for CCWs and provides an incorrect length error if the size does
not match. Under z/VM with simulation activated the length may get
changed. Having the suppress length indication bit set is stated as
good CCW coding practice and avoids errors under z/VM.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 396a61bef1418705af82ab7b5d1e1a193a699dd2
Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date: Fri Jan 1 13:39:22 2016 +0100
s390: fix normalization bug in exception table sorting
commit bcb7825a77f41c7dd91da6f7ac10b928156a322e upstream.
The normalization pass in the sorting routine of the relative exception
table serves two purposes:
- it ensures that the address fields of the exception table entries are
fully ordered, so that no ambiguities arise between entries with
identical instruction offsets (i.e., when two instructions that are
exactly 8 bytes apart each have an exception table entry associated with
them)
- it ensures that the offsets of both the instruction and the fixup fields
of each entry are relative to their final location after sorting.
Commit eb608fb366de ("s390/exceptions: switch to relative exception table
entries") ported the relative exception table format from x86, but modified
the sorting routine to only normalize the instruction offset field and not
the fixup offset field. The result is that the fixup offset of each entry
will be relative to the original location of the entry before sorting,
likely leading to crashes when those entries are dereferenced.
Fixes: eb608fb366de ("s390/exceptions: switch to relative exception table entries")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbf7fab09e9e0612a23ec7fee9570663b21fdfc9
Author: Filipe Manana <fdmanana@suse.com>
Date: Thu Dec 31 18:16:29 2015 +0000
Btrfs: fix number of transaction units required to create symlink
commit 9269d12b2d57d9e3d13036bb750762d1110d425c upstream.
We weren't accounting for the insertion of an inline extent item for the
symlink inode nor that we need to update the parent inode item (through
the call to btrfs_add_nondir()). So fix this by including two more
transaction units.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f81573daa963299aaf5692de7dfc39185f8a96d
Author: Filipe Manana <fdmanana@suse.com>
Date: Thu Dec 31 18:07:59 2015 +0000
Btrfs: send, don't BUG_ON() when an empty symlink is found
commit a879719b8c90e15c9e7fa7266d5e3c0ca962f9df upstream.
When a symlink is successfully created it always has an inline extent
containing the source path. However if an error happens when creating
the symlink, we can leave in the subvolume's tree a symlink inode without
any such inline extent item - this happens if after btrfs_symlink() calls
btrfs_end_transaction() and before it calls the inode eviction handler
(through the final iput() call), the transaction gets committed and a
crash happens before the eviction handler gets called, or if a snapshot
of the subvolume is made before the eviction handler gets called. Sadly
we can't just avoid this by making btrfs_symlink() call
btrfs_end_transaction() after it calls the eviction handler, because the
later can commit the current transaction before it removes any items from
the subvolume tree (if it encounters ENOSPC errors while reserving space
for removing all the items).
So make send fail more gracefully, with an -EIO error, and print a
message to dmesg/syslog informing that there's an empty symlink inode,
so that the user can delete the empty symlink or do something else
about it.
Reported-by: Stephen R. van den Berg <srb@cuci.nl>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6ecb7ece33b2da65dffa50ce13f28fec392ec54
Author: Josef Bacik <jbacik@fb.com>
Date: Thu Oct 22 15:05:09 2015 -0400
Btrfs: igrab inode in writepage
commit be7bd730841e69fe8f70120098596f648cd1f3ff upstream.
We hit this panic on a few of our boxes this week where we have an
ordered_extent with an NULL inode. We do an igrab() of the inode in writepages,
but weren't doing it in writepage which can be called directly from the VM on
dirty pages. If the inode has been unlinked then we could have I_FREEING set
which means igrab() would return NULL and we get this panic. Fix this by trying
to igrab in btrfs_writepage, and if it returns NULL then just redirty the page
and return AOP_WRITEPAGE_ACTIVATE; so the VM knows it wasn't successful. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf00d124e07fda629e60709c6187437d66848a9d
Author: Anand Jain <anand.jain@oracle.com>
Date: Wed Oct 7 17:23:23 2015 +0800
Btrfs: add missing brelse when superblock checksum fails
commit b2acdddfad13c38a1e8b927d83c3cf321f63601a upstream.
Looks like oversight, call brelse() when checksum fails. Further down the
code, in the non error path, we do call brelse() and so we don't see
brelse() in the goto error paths.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 099e9d4f2c57fba73d62db6764e09644636cbbc2
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date: Fri Dec 11 12:09:03 2015 +0000
scripts: recordmcount: break hardlinks
commit dd39a26538e37f6c6131e829a4a510787e43c783 upstream.
recordmcount edits the file in-place, which can cause problems when
using ccache in hardlink mode. Arrange for recordmcount to break a
hardlinked object.
Link: http://lkml.kernel.org/r/E1a7MVT-0000et-62@rmk-PC.arm.linux.org.uk
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f80e6add955c84db83cc5a230c967635f0a808b9
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Fri Dec 11 09:16:38 2015 -0800
ses: fix additional element traversal bug
commit 5e1033561da1152c57b97ee84371dba2b3d64c25 upstream.
KASAN found that our additional element processing scripts drop off
the end of the VPD page into unallocated space. The reason is that
not every element has additional information but our traversal
routines think they do, leading to them expecting far more additional
information than is present. Fix this by adding a gate to the
traversal routine so that it only processes elements that are expected
to have additional information (list is in SES-2 section 6.1.13.1:
Additional Element Status diagnostic page overview)
Reported-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Tested-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8569305e453645f1227a627ec1ced1c68291d63
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Tue Dec 8 09:00:31 2015 -0800
ses: Fix problems with simple enclosures
commit 3417c1b5cb1fdc10261dbed42b05cc93166a78fd upstream.
Simple enclosure implementations (mostly USB) are allowed to return only
page 8 to every diagnostic query. That really confuses our
implementation because we assume the return is the page we asked for and
end up doing incorrect offsets based on bogus information leading to
accesses outside of allocated ranges. Fix that by checking the page
code of the return and giving an error if it isn't the one we asked for.
This should fix reported bugs with USB storage by simply refusing to
attach to enclosures that behave like this. It's also good defensive
practise now that we're starting to see more USB enclosures.
Reported-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffb785e178acd0d965e4338c561f82bdb2d054b6
Author: Johannes Berg <johannes.berg@intel.com>
Date: Thu Dec 10 10:37:51 2015 +0100
rfkill: copy the name into the rfkill struct
commit b7bb110008607a915298bf0f47d25886ecb94477 upstream.
Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever, I specifically
found the case that the rfkill name was quite obviously an invalid
pointer (or at least garbage) when the wiphy had been renamed.
Fix this by making a copy of the rfkill name in rfkill_alloc().
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cec78832326106ebd03bca2235a461aa6fac804
Author: Kirill A. Shutemov <kirill@shutemov.name>
Date: Mon Nov 30 04:17:31 2015 +0200
vgaarb: fix signal handling in vga_get()
commit 9f5bd30818c42c6c36a51f93b4df75a2ea2bd85e upstream.
There are few defects in vga_get() related to signal hadning:
- we shouldn't check for pending signals for TASK_UNINTERRUPTIBLE
case;
- if we found pending signal we must remove ourself from wait queue
and change task state back to running;
- -ERESTARTSYS is more appropriate, I guess.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 859ac05006374e785820017deba204859bbdf1ad
Author: Joe Thornber <ejt@redhat.com>
Date: Thu Dec 10 14:37:53 2015 +0000
dm btree: fix bufio buffer leaks in dm_btree_del() error path
commit ed8b45a3679eb49069b094c0711b30833f27c734 upstream.
If dm_btree_del()'s call to push_frame() fails, e.g. due to
btree_node_validator finding invalid metadata, the dm_btree_del() error
path must unlock all frames (which have active dm-bufio buffers) that
were pushed onto the del_stack.
Otherwise, dm_bufio_client_destroy() will BUG_ON() because dm-bufio
buffers have leaked, e.g.:
device-mapper: bufio: leaked buffer 3, hold count 1, list 0
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6eab6f74f15e0f38d875d523f74e6bc55492ba7
Author: Mikulas Patocka <mpatocka@redhat.com>
Date: Thu Nov 26 12:00:59 2015 -0500
sata_sil: disable trim
commit d98f1cd0a3b70ea91f1dfda3ac36c3b2e1a4d5e2 upstream.
When I connect an Intel SSD to SATA SIL controller (PCI ID 1095:3114), any
TRIM command results in I/O errors being reported in the log. There is
other similar error reported with TRIM and the SIL controller:
https://bugs.centos.org/view.php?id=5880
Apparently the controller doesn't support TRIM commands. This patch
disables TRIM support on the SATA SIL controller.
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: BMDMA2 stat 0x50001
ata7.00: failed command: DATA SET MANAGEMENT
ata7.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 0 dma 512 out
res 51/04:01:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ABRT }
ata7.00: device reported invalid CHS sector 0
sd 8:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 8:0:0:0: [sdb] tag#0 Sense Key : Illegal Request [current] [descriptor]
sd 8:0:0:0: [sdb] tag#0 Add. Sense: Unaligned write command
sd 8:0:0:0: [sdb] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 21 95 88 00 20 00 00 00 00
blk_update_request: I/O error, dev sdb, sector 2200968
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6bf97b05008739ad7d644758301a427460479537
Author: Sasha Levin <sasha.levin@oracle.com>
Date: Mon Nov 30 20:34:20 2015 -0500
sched/core: Remove false-positive warning from wake_up_process()
commit 119d6f6a3be8b424b200dcee56e74484d5445f7e upstream.
Because wakeups can (fundamentally) be late, a task might not be in
the expected state. Therefore testing against a task's state is racy,
and can yield false positives.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: oleg@redhat.com
Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task")
Link: http://lkml.kernel.org/r/1448933660-23082-1-git-send-email-sasha.levin@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4af5f2490fb0cab9e29f8b45b81cd69aa4d9babe
Author: Mirza Krak <mirza.krak@hostmobility.com>
Date: Tue Nov 10 14:59:34 2015 +0100
can: sja1000: clear interrupts on start
commit 7cecd9ab80f43972c056dc068338f7bcc407b71c upstream.
According to SJA1000 data sheet error-warning (EI) interrupt is not
cleared by setting the controller in to reset-mode.
Then if we have the following case:
- system is suspended (echo mem > /sys/power/state) and SJA1000 is left
in operating state
- A bus error condition occurs which activates EI interrupt, system is
still suspended which means EI interrupt will be not be handled nor
cleared.
If the above two events occur, on resume there is no way to return the
SJA1000 to operating state, except to cycle power to it.
By simply reading the IR register on start we will clear any previous
conditions that could be present.
Signed-off-by: Mirza Krak <mirza.krak@hostmobility.com>
Reported-by: Christian Magnusson <Christian.Magnusson@semcon.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2f3d50558b8e5deeaacf4990e478fc844444de5
Author: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Date: Tue Nov 24 17:13:21 2015 -0500
RDS: fix race condition when sending a message on unbound socket
commit 8c7188b23474cca017b3ef354c4a58456f68303a upstream.
Sasha's found a NULL pointer dereference in the RDS connection code when
sending a message to an apparently unbound socket. The problem is caused
by the code checking if the socket is bound in rds_sendmsg(), which checks
the rs_bound_addr field without taking a lock on the socket. This opens a
race where rs_bound_addr is temporarily set but where the transport is not
in rds_bind(), leading to a NULL pointer dereference when trying to
dereference 'trans' in __rds_conn_create().
Vegard wrote a reproducer for this issue, so kindly ask him to share if
you're interested.
I cannot reproduce the NULL pointer dereference using Vegard's reproducer
with this patch, whereas I could without.
Complete earlier incomplete fix to CVE-2015-6937:
74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection")
Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
Reviewed-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1467bec9a74c7369fbf38c9fbd51f1f61a3f14c7
Author: Johannes Berg <johannes.berg@intel.com>
Date: Tue Nov 17 14:25:21 2015 +0100
mac80211: mesh: fix call_rcu() usage
commit c2e703a55245bfff3db53b1f7cbe59f1ee8a4339 upstream.
When using call_rcu(), the called function may be delayed quite
significantly, and without a matching rcu_barrier() there's no
way to be sure it has finished.
Therefore, global state that could be gone/freed/reused should
never be touched in the callback.
Fix this in mesh by moving the atomic_dec() into the caller;
that's not really a problem since we already unlinked the path
and it will be destroyed anyway.
This fixes a crash Jouni observed when running certain tests in
a certain order, in which the mesh interface was torn down, the
memory reused for a function pointer (work struct) and running
that then crashed since the pointer had been decremented by 1,
resulting in an invalid instruction byte stream.
Fixes: eb2b9311fd00 ("mac80211: mesh path table implementation")
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 963e16256e30f627f5c105814a8d9658f2107b7e
Author: Suman Anna <s-anna@ti.com>
Date: Wed Sep 16 19:29:17 2015 -0500
virtio: fix memory leak of virtio ida cache layers
commit c13f99b7e945dad5273a8b7ee230f4d1f22d3354 upstream.
The virtio core uses a static ida named virtio_index_ida for
assigning index numbers to virtio devices during registration.
The ida core may allocate some internal idr cache layers and
an ida bitmap upon any ida allocation, and all these layers are
truely freed only upon the ida destruction. The virtio_index_ida
is not destroyed at present, leading to a memory leak when using
the virtio core as a module and atleast one virtio device is
registered and unregistered.
Fix this by invoking ida_destroy() in the virtio core module
exit.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6039f028a9bbf7cb34ecfac31d5a9df68453221d
Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Date: Mon Nov 23 10:35:36 2015 -0500
ring-buffer: Update read stamp with first real commit on page
commit b81f472a208d3e2b4392faa6d17037a89442f4ce upstream.
Do not update the read stamp after swapping out the reader page from the
write buffer. If the reader page is swapped out of the buffer before an
event is written to it, then the read_stamp may get an out of date
timestamp, as the page timestamp is updated on the first commit to that
page.
rb_get_reader_page() only returns a page if it has an event on it, otherwise
it will return NULL. At that point, check if the page being returned has
events and has not been read yet. Then at that point update the read_stamp
to match the time stamp of the reader page.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b63a96fada140801597b62a2a6a6818a96ae39e9
Author: Jan Kara <jack@suse.cz>
Date: Mon Nov 23 13:09:51 2015 +0100
vfs: Avoid softlockups with sendfile(2)
commit c2489e07c0a71a56fb2c84bc0ee66cddfca7d068 upstream.
The following test program from Dmitry can cause softlockups or RCU
stalls as it copies 1GB from tmpfs into eventfd and we don't have any
scheduling point at that path in sendfile(2) implementation:
int r1 = eventfd(0, 0);
int r2 = memfd_create("", 0);
unsigned long n = 1<<30;
fallocate(r2, 0, 0, n);
sendfile(r1, r2, 0, n);
Add cond_resched() into __splice_from_pipe() to fix the problem.
CC: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 677eea664a18f46ec7186d1b0a1595f8ac2cb959
Author: Vineet Gupta <vgupta@synopsys.com>
Date: Mon Nov 23 19:32:51 2015 +0530
ARC: dw2 unwind: Remove falllback linear search thru FDE entries
commit 2e22502c080f27afeab5e6f11e618fb7bc7aea53 upstream.
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:
in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
search (which iterates thru each of ~11K entries) thus takes 2 orders of
magnitude longer (~3 million cycles vs. 2000). Routines written in hand
assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
yet) fail the unwinder binary lookup, hit linear search, failing
nevertheless in the end.
However the linear search is pointless as binary lookup tables are created
from it in first place. It is impossible to have binary lookup fail while
succeed the linear search. It is pure waste of cycles thus removed by
this patch.
This manifested as RCU stalls / NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was perf counter overflowing in routine lacking dwarf info (like memset)
leading to patheic 3 million cycle unwinder slow path and by the time it
returned new interrupts were already pending (Timer, IPI) and taken
rightaway. The original memset didn't make forward progress, system kept
accruing more interrupts and more unwinder delayes in a vicious feedback
loop, ultimately triggering the NMI diagnostic.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a27f61bd411e564eb4651c18d225f6e9e1de534
Author: Kees Cook <keescook@chromium.org>
Date: Thu Nov 19 17:18:54 2015 -0800
mac: validate mac_partition is within sector
commit 02e2a5bfebe99edcf9d694575a75032d53fe1b73 upstream.
If md->signature == MAC_DRIVER_MAGIC and md->block_size == 1023, a single
512 byte sector would be read (secsize / 512). However the partition
structure would be located past the end of the buffer (secsize % 512).
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3dda035201bd6299d95414c7d9943f721b7be87
Author: Luca Porzio <lporzio@micron.com>
Date: Fri Nov 6 15:12:26 2015 +0000
mmc: remove bondage between REQ_META and reliable write
commit d3df0465db00cf4ed9f90d0bfc3b827d32b9c796 upstream.
Anytime a write operation is performed with Reliable Write flag enabled,
the eMMC device is enforced to bypass the cache and do a write to the
underling NVM device by Jedec specification; this causes a performance
penalty since write operations can't be optimized by the device cache.
In our tests, we replayed a typical mobile daily trace pattern and found
~9% overall time reduction in trace replay by using this patch. Also the
write ops within 4KB~64KB chunk size range get a 40~60% performance
improvement by using the patch (as this range of write chunks are the ones
affected by REQ_META).
This patch has been discussed in the Mobile & Embedded Linux Storage Forum
and it's the results of feedbacks from many people. We also checked with
fsdevl and f2fs mailing list developers that this change in the usage of
REQ_META is not affecting FS behavior and we got positive feedbacks.
Reporting here the feedbacks:
http://comments.gmane.org/gmane.linux.file-systems/97219
http://thread.gmane.org/gmane.linux.file-systems.f2fs/3178/focus=3183
Signed-off-by: Bruce Ford <bford@micron.com>
Signed-off-by: Luca Porzio <lporzio@micron.com>
Fixes: ce39f9d17c14 ("mmc: support packed write command for eMMC4.5 devices")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eef125560fd129761cd4842ae061d543e81b533a
Author: sumit.saxena@avagotech.com <sumit.saxena@avagotech.com>
Date: Thu Oct 15 13:40:54 2015 +0530
megaraid_sas : SMAP restriction--do not access user memory from IOCTL code
commit 323c4a02c631d00851d8edc4213c4d184ef83647 upstream.
This is an issue on SMAP enabled CPUs and 32 bit apps running on 64 bit
OS. Do not access user memory from kernel code. The SMAP bit restricts
accessing user memory from kernel code.
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64896131a6e3c60133360b2ff2a70487eb35f721
Author: sumit.saxena@avagotech.com <sumit.saxena@avagotech.com>
Date: Thu Oct 15 13:40:04 2015 +0530
megaraid_sas: Do not use PAGE_SIZE for max_sectors
commit 357ae967ad66e357f78b5cfb5ab6ca07fb4a7758 upstream.
Do not use PAGE_SIZE marco to calculate max_sectors per I/O
request. Driver code assumes PAGE_SIZE will be always 4096 which can
lead to wrongly calculated value if PAGE_SIZE is not 4096. This issue
was reported in Ubuntu Bugzilla Bug #1475166.
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 621264c8898d4e0a5d14919ea900d82c2eab6262
Author: Valentin Rothberg <valentinrothberg@gmail.com>
Date: Tue Sep 22 19:00:40 2015 +0200
wm831x_power: Use IRQF_ONESHOT to request threaded IRQs
commit 90adf98d9530054b8e665ba5a928de4307231d84 upstream.
Since commit 1c6c69525b40 ("genirq: Reject bogus threaded irq requests")
threaded IRQs without a primary handler need to be requested with
IRQF_ONESHOT, otherwise the request will fail.
scripts/coccinelle/misc/irqf_oneshot.cocci detected this issue.
Fixes: b5874f33bbaf ("wm831x_power: Use genirq")
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 736652169c60e99df027befd07f9217bcaba840d
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date: Mon Sep 21 19:21:51 2015 +0300
devres: fix a for loop bounds check
commit 1f35d04a02a652f14566f875aef3a6f2af4cb77b upstream.
The iomap[] array has PCIM_IOMAP_MAX (6) elements and not
DEVICE_COUNT_RESOURCE (16). This bug was found using a static checker.
It may be that the "if (!(mask & (1 << i)))" check means we never
actually go past the end of the array in real life.
Fixes: ec04b075843d ('iomap: implement pcim_iounmap_regions()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db13625b0968e8ccaa14dfa0fcb2b347524be05e
Author: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Wed Sep 23 15:49:29 2015 +0300
lockd: create NSM handles per net namespace
commit 0ad95472bf169a3501991f8f33f5147f792a8116 upstream.
Commit cb7323fffa85 ("lockd: create and use per-net NSM
RPC clients on MON/UNMON requests") introduced per-net
NSM RPC clients. Unfortunately this doesn't make any sense
without per-net nsm_handle.
E.g. the following scenario could happen
Two hosts (X and Y) in different namespaces (A and B) share
the same nsm struct.
1. nsm_monitor(host_X) called => NSM rpc client created,
nsm->sm_monitored bit set.
2. nsm_mointor(host-Y) called => nsm->sm_monitored already set,
we just exit. Thus in namespace B ln->nsm_clnt == NULL.
3. host X destroyed => nsm->sm_count decremented to 1
4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr
dereference of *ln->nsm_clnt
So this could be fixed by making per-net nsm_handles list,
instead of global. Thus different net namespaces will not be able
share the same nsm_handle.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1644fe6cc1567ecde034ea8acd5f4d6146e395b5
Author: Roman Volkov <rvolkov@v1ros.org>
Date: Fri Jan 1 16:24:41 2016 +0300
clocksource/drivers/vt8500: Increase the minimum delta
commit f9eccf24615672896dc13251410c3f2f33a14f95 upstream.
The vt8500 clocksource driver declares itself as capable to handle the
minimum delay of 4 cycles by passing the value into
clockevents_config_and_register(). The vt8500_timer_set_next_event()
requires the passed cycles value to be at least 16. The impact is that
userspace hangs in nanosleep() calls with small delay intervals.
This problem is reproducible in Linux 4.2 starting from:
c6eb3f70d448 ('hrtimer: Get rid of hrtimer softirq')
From Russell King, more detailed explanation:
"It's a speciality of the StrongARM/PXA hardware. It takes a certain
number of OSCR cycles for the value written to hit the compare registers.
So, if a very small delta is written (eg, the compare register is written
with a value of OSCR + 1), the OSCR will have incremented past this value
before it hits the underlying hardware. The result is, that you end up
waiting a very long time for the OSCR to wrap before the event fires.
So, we introduce a check in set_next_event() to detect this and return
-ETIME if the calculated delta is too small, which causes the generic
clockevents code to retry after adding the min_delta specified in
clockevents_config_and_register() to the current time value.
min_delta must be sufficient that we don't re-trip the -ETIME check - if
we do, we will return -ETIME, forward the next event time, try to set it,
return -ETIME again, and basically lock the system up. So, min_delta
must be larger than the check inside set_next_event(). A factor of two
was chosen to ensure that this situation would never occur.
The PXA code worked on PXA systems for years, and I'd suggest no one
changes this mechanism without access to a wide range of PXA systems,
otherwise they're risking breakage."
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Roman Volkov <rvolkov@v1ros.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf5cd0c632e49ca583cd3531b55693e615a2b332
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sun Dec 13 18:12:30 2015 +0100
genirq: Prevent chip buslock deadlock
commit abc7e40c81d113ef4bacb556f0a77ca63ac81d85 upstream.
If a interrupt chip utilizes chip->buslock then free_irq() can
deadlock in the following way:
CPU0 CPU1
interrupt(X) (Shared or spurious)
free_irq(X) interrupt_thread(X)
chip_bus_lock(X)
irq_finalize_oneshot(X)
chip_bus_lock(X)
synchronize_irq(X)
synchronize_irq() waits for the interrupt thread to complete,
i.e. forever.
Solution is simple: Drop chip_bus_lock() before calling
synchronize_irq() as we do with the irq_desc lock. There is nothing to
be protected after the point where irq_desc lock has been released.
This adds chip_bus_lock/unlock() to the remove_irq() code path, but
that's actually correct in the case where remove_irq() is called on
such an interrupt. The current users of remove_irq() are not affected
as none of those interrupts is on a chip which requires buslock.
Reported-by: Fredrik Markström <fredrik.markstrom@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 341f09c01a7f26a030f3bedb08e4ce91e3ca24d3
Author: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed Feb 3 02:11:03 2016 +0100
unix: correctly track in-flight fds in sending process user_struct
commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6 upstream.
The commit referenced in the Fixes tag incorrectly accounted the number
of in-flight fds over a unix domain socket to the original opener
of the file-descriptor. This allows another process to arbitrary
deplete the original file-openers resource limit for the maximum of
open files. Instead the sending processes and its struct cred should
be credited.
To do so, we add a reference counted struct user_struct pointer to the
scm_fp_list and use it to account for the number of inflight unix fds.
Fixes: 712f4aad406bb1 ("unix: properly account for FDs passed over unix sockets")
Reported-by: David Herrmann <dh.herrmann@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59ae7b1c13bd615b09bff9d03aaa335559af604a
Author: Olga Kornievskaia <aglo@umich.edu>
Date: Mon Sep 14 19:54:36 2015 -0400
Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
commit a41cbe86df3afbc82311a1640e20858c0cd7e065 upstream.
A test case is as the description says:
open(foobar, O_WRONLY);
sleep() --> reboot the server
close(foobar)
The bug is because in nfs4state.c in nfs4_reclaim_open_state() a few
line before going to restart, there is
clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags).
NFS4CLNT_RECLAIM_NOGRACE is a flag for the client states not open
owner states. Value of NFS4CLNT_RECLAIM_NOGRACE is 4 which is the
value of NFS_O_WRONLY_STATE in nfs4_state->flags. So clearing it wipes
out state and when we go to close it, “call_close” doesn’t get set as
state flag is not set and CLOSE doesn’t go on the wire.
Signed-off-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c67196fd14f5f53eb865719bfca1908fa963618
Author: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Wed May 6 17:26:47 2015 +0200
splice: sendfile() at once fails for big files
commit 0ff28d9f4674d781e492bcff6f32f0fe48cf0fed upstream.
Using sendfile with below small program to get MD5 sums of some files,
it appear that big files (over 64kbytes with 4k pages system) get a
wrong MD5 sum while small files get the correct sum.
This program uses sendfile() to send a file to an AF_ALG socket
for hashing.
/* md5sum2.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/if_alg.h>
int main(int argc, char **argv)
{
int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
struct stat st;
struct sockaddr_alg sa = {
.salg_family = AF_ALG,
.salg_type = "hash",
.salg_name = "md5",
};
int n;
bind(sk, (struct sockaddr*)&sa, sizeof(sa));
for (n = 1; n < argc; n++) {
int size;
int offset = 0;
char buf[4096];
int fd;
int sko;
int i;
fd = open(argv[n], O_RDONLY);
sko = accept(sk, NULL, 0);
fstat(fd, &st);
size = st.st_size;
sendfile(sko, fd, &offset, size);
size = read(sko, buf, sizeof(buf));
for (i = 0; i < size; i++)
printf("%2.2x", buf[i]);
printf(" %s\n", argv[n]);
close(fd);
close(sko);
}
exit(0);
}
Test below is done using official linux patch files. First result is
with a software based md5sum. Second result is with the program above.
root@vgoip:~# ls -l patch-3.6.*
-rw-r--r-- 1 root root 64011 Aug 24 12:01 patch-3.6.2.gz
-rw-r--r-- 1 root root 94131 Aug 24 12:01 patch-3.6.3.gz
root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz
root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
5fd77b24e68bb24dcc72d6e57c64790e patch-3.6.3.gz
After investivation, it appears that sendfile() sends the files by blocks
of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
is reset as if it was the end of the file.
This patch adds SPLICE_F_MORE to the flags when more data is pending.
With the patch applied, we get the correct sums:
root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz
root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23a9a7dc6f75393a21c0378d13b380e46907b877
Author: James Hogan <james.hogan@imgtec.com>
Date: Wed Nov 11 14:21:20 2015 +0000
MIPS: KVM: Uninit VCPU in vcpu_create error path
commit 585bb8f9a5e592f2ce7abbe5ed3112d5438d2754 upstream.
If either of the memory allocations in kvm_arch_vcpu_create() fail, the
vcpu which has been allocated and kvm_vcpu_init'd doesn't get uninit'd
in the error handling path. Add a call to kvm_vcpu_uninit() to fix this.
Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b0fc4511317bbc71ca8bbcf032013ffbfe0a2bc
Author: James Hogan <james.hogan@imgtec.com>
Date: Wed Nov 11 14:21:19 2015 +0000
MIPS: KVM: Fix CACHE immediate offset sign extension
commit c5c2a3b998f1ff5a586f9d37e154070b8d550d17 upstream.
The immediate field of the CACHE instruction is signed, so ensure that
it gets sign extended by casting it to an int16_t rather than just
masking the low 16 bits.
Fixes: e685c689f3a8 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5525dd65cd8e4f80ede26993f6f665df7eeec1dc
Author: James Hogan <james.hogan@imgtec.com>
Date: Wed Nov 11 14:21:18 2015 +0000
MIPS: KVM: Fix ASID restoration logic
commit 002374f371bd02df864cce1fe85d90dc5b292837 upstream.
ASID restoration on guest resume should determine the guest execution
mode based on the guest Status register rather than bit 30 of the guest
PC.
Fix the two places in locore.S that do this, loading the guest status
from the cop0 area. Note, this assembly is specific to the trap &
emulate implementation of KVM, so it doesn't need to check the
supervisor bit as that mode is not implemented in the guest.
Fixes: b680f70fc111 ("KVM/MIPS32: Entry point for trampolining to...")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1630624d5387a837112fb3663fe5daa22c680267
Author: Hariprasad S <hariprasad@chelsio.com>
Date: Fri Dec 11 13:59:17 2015 +0530
iw_cxgb3: Fix incorrectly returning error on success
commit 67f1aee6f45059fd6b0f5b0ecb2c97ad0451f6b3 upstream.
The cxgb3_*_send() functions return NET_XMIT_ values, which are
positive integers values. So don't treat positive return values
as an error.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
[a pox on developers and maintainers who do not cc: stable for bug fixes like this - gregkh]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b048b93f0e5b249b0add0bde6ec7ec75a07eac9c
Author: Corey Wright <undefined@pobox.com>
Date: Sun Feb 28 02:42:39 2016 -0600
proc: Fix ptrace-based permission checks for accessing task maps
Modify mm_access() calls in fs/proc/task_mmu.c and fs/proc/task_nommu.c to
have the mode include PTRACE_MODE_FSCREDS so accessing /proc/pid/maps and
/proc/pid/pagemap is not denied to all users.
In backporting upstream commit caaee623 to pre-3.18 kernel versions it was
overlooked that mm_access() is used in fs/proc/task_*mmu.c as those calls
were removed in 3.18 (by upstream commit 29a40ace) and did not exist at the
time of the original commit.
Signed-off-by: Corey Wright <undefined@pobox.com>
Acked-by: Jann Horn <jann@thejh.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db9b6792736a57291901032e4a2036dfc9f0ab95
Author: Bjørn Mork <bjorn@mork.no>
Date: Fri Feb 12 16:40:00 2016 +0100
USB: option: add "4G LTE usb-modem U901"
commit d061c1caa31d4d9792cfe48a2c6b309a0e01ef46 upstream.
Thomas reports:
T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=05c6 ProdID=6001 Rev=00.00
S: Manufacturer=USB Modem
S: Product=USB Modem
S: SerialNumber=1234567890ABCDEF
C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b920f51b68e4d965b51cab8060fae66eeb21f218
Author: Andrey Skvortsov <andrej.skvortzov@gmail.com>
Date: Fri Jan 29 00:07:30 2016 +0300
USB: option: add support for SIM7100E
commit 3158a8d416f4e1b79dcc867d67cb50013140772c upstream.
$ lsusb:
Bus 001 Device 101: ID 1e0e:9001 Qualcomm / Option
$ usb-devices:
T: Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=101 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 2
P: Vendor=1e0e ProdID=9001 Rev= 2.32
S: Manufacturer=SimTech, Incorporated
S: Product=SimTech, Incorporated
S: SerialNumber=0123456789ABCDEF
C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:* If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
The last interface (6) is used for Android Composite ADB interface.
Serial port layout:
0: QCDM/DIAG
1: NMEA
2: AT
3: AT/PPP
4: audio
Signed-off-by: Andrey Skvortsov <andrej.skvortzov@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9195b27396cbf8459e19593ec34586e66857ab01
Author: Ken Lin <ken.lin@advantech.com.tw>
Date: Mon Feb 1 14:57:25 2016 -0500
USB: cp210x: add IDs for GE B650V3 and B850V3 boards
commit 6627ae19385283b89356a199d7f03c75ba35fb29 upstream.
Add USB ID for cp2104/5 devices on GE B650v3 and B850v3 boards.
Signed-off-by: Ken Lin <ken.lin@advantech.com.tw>
Signed-off-by: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fadc5c1769ba24cd8498359d93e31408b0e0fbb9
Author: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com>
Date: Tue Dec 22 17:29:16 2015 +0100
can: ems_usb: Fix possible tx overflow
commit 90cfde46586d2286488d8ed636929e936c0c9ab2 upstream.
This patch fixes the problem that more CAN messages could be sent to the
interface as could be send on the CAN bus. This was more likely for slow baud
rates. The sleeping _start_xmit was woken up in the _write_bulk_callback. Under
heavy TX load this produced another bulk transfer without checking the
free_slots variable and hence caused the overflow in the interface.
Signed-off-by: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 86d80ecd4f96b63103abfd1269855a8a3b9d47cc
Author: Nikolay Borisov <kernel@kyup.com>
Date: Thu Dec 17 18:03:35 2015 +0200
dm thin: fix race condition when destroying thin pool workqueue
commit 18d03e8c25f173f4107a40d0b8c24defb6ed69f3 upstream.
When a thin pool is being destroyed delayed work items are
cancelled using cancel_delayed_work(), which doesn't guarantee that on
return the delayed item isn't running. This can cause the work item to
requeue itself on an already destroyed workqueue. Fix this by using
cancel_delayed_work_sync() which guarantees that on return the work item
is not running anymore.
Fixes: 905e51b39a555 ("dm thin: commit outstanding data every second")
Fixes: 85ad643b7e7e5 ("dm thin: add timeout to stop out-of-data-space mode holding IO forever")
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9bb86db161a368f423958981869d17c53ae5f395
Author: Joe Thornber <ejt@redhat.com>
Date: Wed Dec 9 16:23:24 2015 +0000
dm thin metadata: fix bug when taking a metadata snapshot
commit 49e99fc717f624aa75ca755d6e7bc029efd3f0e9 upstream.
When you take a metadata snapshot the btree roots for the mapping and
details tree need to have their reference counts incremented so they
persist for the lifetime of the metadata snap.
The roots being incremented were those currently written in the
superblock, which could possibly be out of date if concurrent IO is
triggering new mappings, breaking of sharing, etc.
Fix this by performing a commit with the metadata lock held while taking
a metadata snapshot.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9ee0d9ad9309385fd877bf7f5a762d4d3b5a6462
Author: Ingo Molnar <mingo@kernel.org>
Date: Tue Mar 3 07:34:33 2015 +0100
efi: Disable interrupts around EFI calls, not in the epilog/prolog calls
commit 23a0d4e8fa6d3a1d7fb819f79bcc0a3739c30ba9 upstream.
Tapasweni Pathak reported that we do a kmalloc() in efi_call_phys_prolog()
on x86-64 while having interrupts disabled, which is a big no-no, as
kmalloc() can sleep.
Solve this by removing the irq disabling from the prolog/epilog calls
around EFI calls: it's unnecessary, as in this stage we are single
threaded in the boot thread, and we don't ever execute this from
interrupt contexts.
Reported-by: Tapasweni Pathak <tapaswenipathak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ luis: backported to 3.10: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2419de63e77bf6f63b5e67a57b1fa7cf535c4ca
Author: Dave Airlie <airlied@redhat.com>
Date: Thu Aug 20 10:13:55 2015 +1000
drm/radeon: fix hotplug race at startup
commit 7f98ca454ad373fc1b76be804fa7138ff68c1d27 upstream.
We apparantly get a hotplug irq before we've initialised
modesetting,
[drm] Loading R100 Microcode
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c125f56f>] __mutex_lock_slowpath+0x23/0x91
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: radeon(+) drm_kms_helper ttm drm i2c_algo_bit backlight pcspkr psmouse evdev sr_mod input_leds led_class cdrom sg parport_pc parport floppy intel_agp intel_gtt lpc_ich acpi_cpufreq processor button mfd_core agpgart uhci_hcd ehci_hcd rng_core snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm usbcore usb_common i2c_i801 i2c_core snd_timer snd soundcore thermal_sys
CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 4.2.0-rc7-00015-gbf67402 #111
Hardware name: MicroLink /D850MV , BIOS MV85010A.86A.0067.P24.0304081124 04/08/2003
Workqueue: events radeon_hotplug_work_func [radeon]
task: f6ca5900 ti: f6d3e000 task.ti: f6d3e000
EIP: 0060:[<c125f56f>] EFLAGS: 00010282 CPU: 0
EIP is at __mutex_lock_slowpath+0x23/0x91
EAX: 00000000 EBX: f5e900fc ECX: 00000000 EDX: fffffffe
ESI: f6ca5900 EDI: f5e90100 EBP: f5e90000 ESP: f6d3ff0c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 36f61000 CR4: 000006d0
Stack:
f5e90100 00000000 c103c4c1 f6d2a5a0 f5e900fc f6df394c c125f162 f8b0faca
f6d2a5a0 c138ca00 f6df394c f7395600 c1034741 00d40000 00000000 f6d2a5a0
c138ca00 f6d2a5b8 c138ca10 c1034b58 00000001 f6d40000 f6ca5900 f6d0c940
Call Trace:
[<c103c4c1>] ? dequeue_task_fair+0xa4/0xb7
[<c125f162>] ? mutex_lock+0x9/0xa
[<f8b0faca>] ? radeon_hotplug_work_func+0x17/0x57 [radeon]
[<c1034741>] ? process_one_work+0xfc/0x194
[<c1034b58>] ? worker_thread+0x18d/0x218
[<c10349cb>] ? rescuer_thread+0x1d5/0x1d5
[<c103742a>] ? kthread+0x7b/0x80
[<c12601c0>] ? ret_from_kernel_thread+0x20/0x30
[<c10373af>] ? init_completion+0x18/0x18
Code: 42 08 e8 8e a6 dd ff c3 57 56 53 83 ec 0c 8b 35 48 f7 37 c1 8b 10 4a 74 1a 89 c3 8d 78 04 8b 40 08 89 63
Reported-and-Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8719b00b6bcbf3acf3d8c1efebf9e9743d1d2511
Author: Kamal Mostafa <kamal@canonical.com>
Date: Wed Nov 11 14:25:34 2015 -0800
tools: Add a "make all" rule
commit f6ba98c5dc78708cb7fd29950c4a50c4c7e88f95 upstream.
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Pali Rohar <pali.rohar@gmail.com>
Cc: Roberta Dobrescu <roberta.dobrescu@gmail.com>
Link: http://lkml.kernel.org/r/1447280736-2161-2-git-send-email-kamal@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ kamal: backport to 3.10-stable: build all tools for this version ]
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fac660099785accf7c99670c9cceb096098e820
Author: Zheng Liu <wenqing.lz@taobao.com>
Date: Sun Nov 29 17:21:57 2015 -0800
bcache: unregister reboot notifier if bcache fails to unregister device
commit 2ecf0cdb2b437402110ab57546e02abfa68a716b upstream.
In bcache_init() function it forgot to unregister reboot notifier if
bcache fails to unregister a block device. This commit fixes this.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Joshua Schmid <jschmid@suse.com>
Tested-by: Eric Wheeler <bcache@linux.ewheeler.net>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f79d019c4b9692afe3a383df849b927097ff4e1a
Author: Andrey Vagin <avagin@openvz.org>
Date: Wed Jan 29 19:34:14 2014 +0100
netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
Lets look at destroy_conntrack:
hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.
But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().
Florian Westphal suggested to check the confirm bit too. I think it's
right.
task 1 task 2 task 3
nf_conntrack_find_get
____nf_conntrack_find
destroy_conntrack
hlist_nulls_del_rcu
nf_conntrack_free
kmem_cache_free
__nf_conntrack_alloc
kmem_cache_alloc
memset(&ct->tuplehash[IP_CT_DIR_MAX],
if (nf_ct_is_dying(ct))
if (!nf_ct_tuple_equal()
I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.
<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>] [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622] [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697] [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770] [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843] [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919] [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063] [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207] [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277] [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346] [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419] [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491] [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562] [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638] [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712] [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785] [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858] [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081] [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151] [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229] [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303] [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378] [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454] [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531] [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f5bc010e9feef21555b9870ee4b5b06c9feae73
Author: Egbert Eich <eich@suse.de>
Date: Wed Jun 11 14:59:55 2014 +0200
drm/ast: Initialized data needed to map fbdev memory
commit 28fb4cb7fa6f63dc2fbdb5f2564dcbead8e3eee0 upstream.
Due to a missing initialization there was no way to map fbdev memory.
Thus for example using the Xserver with the fbdev driver failed.
This fix adds initialization for fix.smem_start and fix.smem_len
in the fb_info structure, which fixes this problem.
Requested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Egbert Eich <eich@suse.de>
[pulled from SuSE tree by me - airlied]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3e07084a8e08c4d4a02ef352e20419ba9835149
Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Date: Mon Feb 15 12:36:14 2016 -0500
tracepoints: Do not trace when cpu is offline
commit f37755490fe9bf76f6ba1d8c6591745d3574a6a6 upstream.
The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.
This can probuce the following warning:
[ INFO: suspicious RCU usage. ]
4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
-------------------------------
include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/8/0.
stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34
Call Trace:
[c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
[c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
[c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
[c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
[c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
[c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
[c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
[c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
[c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
[c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
[c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
[c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.
Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.
Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.
Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org
Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | |
|
| | |
|
| |
|