Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point
intrinsics are currently never passed to the target back-end, which makes
it impossible to handle them correctly (e.g. mark instructions as depending
on a floating-point status and control register, or mark instructions as
possibly trapping).

This patch allows the target to use setOperationAction to switch the action
on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code
will stop converting the STRICT nodes to regular floating-point nodes, but
instead pass the STRICT nodes to the target using normal SelectionDAG
matching rules.
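
As a rough illustration of the opt-in described above, here is a toy model of
the decision. It is not LLVM's real TargetLowering: ToyOp, ToyAction,
ToyLowering and opcodeSeenByTarget are hypothetical names, the real
setOperationAction also takes a value type, and the "anything not Legal falls
back to the non-strict opcode" default is an assumption made for the sketch.

```cpp
#include <map>

// Hypothetical stand-ins; the real opcodes live in the ISD namespace.
enum class ToyOp { FADD, STRICT_FADD };
enum class ToyAction { Expand, Legal };

struct ToyLowering {
  std::map<ToyOp, ToyAction> Actions;

  // Simplified: the real API is also keyed on the value type.
  void setOperationAction(ToyOp Op, ToyAction A) { Actions[Op] = A; }

  // Assumption for this sketch: ops without an entry are not Legal.
  ToyAction getOperationAction(ToyOp Op) const {
    auto It = Actions.find(Op);
    return It == Actions.end() ? ToyAction::Expand : It->second;
  }
};

// Returns the opcode that instruction selection would be asked to match.
inline ToyOp opcodeSeenByTarget(const ToyLowering &TLI, ToyOp Op) {
  if (Op == ToyOp::STRICT_FADD &&
      TLI.getOperationAction(Op) != ToyAction::Legal)
    return ToyOp::FADD; // strict node converted to the regular node
  return Op;            // strict node passed through to the target
}
```

In this model a target that never calls setOperationAction keeps the old
behavior, while one that marks STRICT_FADD as Legal receives the strict node
unchanged.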

To avoid having the back-end duplicate all the floating-point instruction
patterns to handle both strict and non-strict variants, we make the MI
codegen explicitly aware of the floating-point exceptions by introducing
two new concepts:

- A new MCID flag "mayRaiseFPException" that the target should set on any
  instruction that can possibly raise an FP exception according to the
  architecture definition.
- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI
  instruction resulting from expansion of any constrained FP intrinsic.

Any MI instruction that is *both* marked as mayRaiseFPException *and*
FPExcept then needs to be considered as raising exceptions by MI-level
codegen (e.g. scheduling).
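
The "both flags" rule can be sketched with a small stand-alone model.
ToyInstrDesc and ToyMachineInstr are hypothetical stand-ins for MCInstrDesc
and MachineInstr; only the FPExcept bit position and the shape of the
predicate are taken from this patch.

```cpp
#include <cstdint>

// Static, per-opcode property (the patch sets the real one via TableGen).
struct ToyInstrDesc {
  bool MayRaiseFPException;
};

// Per-instruction state; a toy model, not the real MachineInstr.
struct ToyMachineInstr {
  // Bit position matches the MIFlag enum added in this patch.
  static constexpr std::uint32_t FPExcept = 1u << 14;

  const ToyInstrDesc *Desc;
  std::uint32_t Flags;

  bool getFlag(std::uint32_t F) const { return (Flags & F) != 0; }

  // True only if the opcode can raise an exception *and* this particular
  // instruction came from a context where exceptions matter.
  bool mayRaiseFPException() const {
    return Desc->MayRaiseFPException && getFlag(FPExcept);
  }
};
```

With this split, an FP add produced from ordinary IR carries the opcode
property but not the flag, so MI-level passes remain free to move or CSE it.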

Setting those two new flags is straightforward. The mayRaiseFPException
flag is simply set via TableGen by marking all relevant instruction
patterns in the .td files.

The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes
in the SelectionDAG, and is inherited by the MachineSDNode nodes created
from them during instruction selection. The flag is then transferred to an
MIFlag when creating the MI from the MachineSDNode. This is handled the
same way fast-math flags such as no-nans are handled today.

This patch includes both common code changes required to implement the
new features, and the SystemZ implementation.

Reviewed By: andrew.w.kaylor

Differential Revision: https://reviews.llvm.org/D55506

llvm-svn: 362663
uweigand committed Jun 5, 2019
1 parent 2f94203 commit 6c5d5ce
Showing 82 changed files with 5,788 additions and 372 deletions.
15 changes: 14 additions & 1 deletion llvm/include/llvm/CodeGen/MachineInstr.h
@@ -102,8 +102,10 @@ class MachineInstr
// no unsigned wrap.
NoSWrap = 1 << 12, // Instruction supports binary operator
// no signed wrap.
-IsExact = 1 << 13 // Instruction supports division is
+IsExact = 1 << 13, // Instruction supports division is
// known to be exact.
+FPExcept = 1 << 14, // Instruction may raise floating-point
+// exceptions.
};

private:
@@ -830,6 +832,17 @@ class MachineInstr
return mayLoad(Type) || mayStore(Type);
}

+/// Return true if this instruction could possibly raise a floating-point
+/// exception. This is the case if the instruction is a floating-point
+/// instruction that can in principle raise an exception, as indicated
+/// by the MCID::MayRaiseFPException property, *and* at the same time,
+/// the instruction is used in a context where we expect floating-point
+/// exceptions might be enabled, as indicated by the FPExcept MI flag.
+bool mayRaiseFPException() const {
+return hasProperty(MCID::MayRaiseFPException) &&
+getFlag(MachineInstr::MIFlag::FPExcept);
+}

//===--------------------------------------------------------------------===//
// Flags that indicate whether an instruction can be modified by a method.
//===--------------------------------------------------------------------===//
17 changes: 15 additions & 2 deletions llvm/include/llvm/CodeGen/SelectionDAGNodes.h
@@ -368,14 +368,21 @@ struct SDNodeFlags {
bool ApproximateFuncs : 1;
bool AllowReassociation : 1;

+// We assume instructions do not raise floating-point exceptions by default,
+// and only those marked explicitly may do so. We could choose to represent
+// this via a positive "FPExcept" flags like on the MI level, but having a
+// negative "NoFPExcept" flag here (that defaults to true) makes the flag
+// intersection logic more straightforward.
+bool NoFPExcept : 1;

public:
/// Default constructor turns off all optimization flags.
SDNodeFlags()
: AnyDefined(false), NoUnsignedWrap(false), NoSignedWrap(false),
Exact(false), NoNaNs(false), NoInfs(false),
NoSignedZeros(false), AllowReciprocal(false), VectorReduction(false),
AllowContract(false), ApproximateFuncs(false),
-AllowReassociation(false) {}
+AllowReassociation(false), NoFPExcept(true) {}

/// Propagate the fast-math-flags from an IR FPMathOperator.
void copyFMF(const FPMathOperator &FPMO) {
@@ -438,6 +445,10 @@ struct SDNodeFlags {
setDefined();
AllowReassociation = b;
}
+void setFPExcept(bool b) {
+setDefined();
+NoFPExcept = !b;
+}

// These are accessors for each flag.
bool hasNoUnsignedWrap() const { return NoUnsignedWrap; }
@@ -451,9 +462,10 @@ struct SDNodeFlags {
bool hasAllowContract() const { return AllowContract; }
bool hasApproximateFuncs() const { return ApproximateFuncs; }
bool hasAllowReassociation() const { return AllowReassociation; }
+bool hasFPExcept() const { return !NoFPExcept; }

bool isFast() const {
-return NoSignedZeros && AllowReciprocal && NoNaNs && NoInfs &&
+return NoSignedZeros && AllowReciprocal && NoNaNs && NoInfs && NoFPExcept &&
AllowContract && ApproximateFuncs && AllowReassociation;
}

@@ -473,6 +485,7 @@ struct SDNodeFlags {
AllowContract &= Flags.AllowContract;
ApproximateFuncs &= Flags.ApproximateFuncs;
AllowReassociation &= Flags.AllowReassociation;
+NoFPExcept &= Flags.NoFPExcept;
}
};
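
To see why the negative encoding keeps the intersection logic simple, consider
the reduced model below. ToyNodeFlags is a hypothetical two-flag cut-down of
SDNodeFlags (it omits the AnyDefined bookkeeping and the other flags); the
default value, the setter, the getter and the `&=` intersection follow the
hunks above.

```cpp
// Reduced sketch of SDNodeFlags: one fast-math flag plus NoFPExcept.
struct ToyNodeFlags {
  bool NoNaNs = false;    // optimization flags default to "off"
  bool NoFPExcept = true; // by default, nodes raise no FP exceptions

  void setFPExcept(bool B) { NoFPExcept = !B; }
  bool hasFPExcept() const { return !NoFPExcept; }

  // Keep only the guarantees that hold for both nodes. Because every
  // member is a "this is safe to assume" bit, a uniform AND is correct.
  void intersectWith(const ToyNodeFlags &Other) {
    NoNaNs &= Other.NoNaNs;
    NoFPExcept &= Other.NoFPExcept;
  }
};
```

Had the flag been stored positively as FPExcept, the intersection would need
an OR for that one member, since a merged node may raise exceptions if either
input may.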

6 changes: 6 additions & 0 deletions llvm/include/llvm/MC/MCInstrDesc.h
@@ -134,6 +134,7 @@ enum Flag {
FoldableAsLoad,
MayLoad,
MayStore,
MayRaiseFPException,
Predicable,
NotDuplicable,
UnmodeledSideEffects,
@@ -403,6 +404,11 @@ class MCInstrDesc {
/// may not actually modify anything, for example.
bool mayStore() const { return Flags & (1ULL << MCID::MayStore); }

/// Return true if this instruction may raise a floating-point exception.
bool mayRaiseFPException() const {
return Flags & (1ULL << MCID::MayRaiseFPException);
}

/// Return true if this instruction has side
/// effects that are not modeled by other flags. This does not return true
/// for instructions whose effects are captured by:
1 change: 1 addition & 0 deletions llvm/include/llvm/Target/Target.td
@@ -456,6 +456,7 @@ class Instruction {
bit canFoldAsLoad = 0; // Can this be folded as a simple memory operand?
bit mayLoad = ?; // Is it possible for this inst to read memory?
bit mayStore = ?; // Is it possible for this inst to write memory?
bit mayRaiseFPException = 0; // Can this raise a floating-point exception?
bit isConvertibleToThreeAddress = 0; // Can this 2-addr instruction promote?
bit isCommutable = 0; // Is this 3 operand instruction commutable?
bit isTerminator = 0; // Is this part of the terminator for a basic block?
115 changes: 115 additions & 0 deletions llvm/include/llvm/Target/TargetSelectionDAG.td
@@ -467,6 +467,53 @@ def fp_to_uint : SDNode<"ISD::FP_TO_UINT" , SDTFPToIntOp>;
def f16_to_fp : SDNode<"ISD::FP16_TO_FP" , SDTIntToFPOp>;
def fp_to_f16 : SDNode<"ISD::FP_TO_FP16" , SDTFPToIntOp>;

def strict_fadd : SDNode<"ISD::STRICT_FADD",
SDTFPBinOp, [SDNPHasChain, SDNPCommutative]>;
def strict_fsub : SDNode<"ISD::STRICT_FSUB",
SDTFPBinOp, [SDNPHasChain]>;
def strict_fmul : SDNode<"ISD::STRICT_FMUL",
SDTFPBinOp, [SDNPHasChain, SDNPCommutative]>;
def strict_fdiv : SDNode<"ISD::STRICT_FDIV",
SDTFPBinOp, [SDNPHasChain]>;
def strict_frem : SDNode<"ISD::STRICT_FREM",
SDTFPBinOp, [SDNPHasChain]>;
def strict_fma : SDNode<"ISD::STRICT_FMA",
SDTFPTernaryOp, [SDNPHasChain]>;
def strict_fsqrt : SDNode<"ISD::STRICT_FSQRT",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fsin : SDNode<"ISD::STRICT_FSIN",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fcos : SDNode<"ISD::STRICT_FCOS",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fexp2 : SDNode<"ISD::STRICT_FEXP2",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fpow : SDNode<"ISD::STRICT_FPOW",
SDTFPBinOp, [SDNPHasChain]>;
def strict_flog2 : SDNode<"ISD::STRICT_FLOG2",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_frint : SDNode<"ISD::STRICT_FRINT",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fnearbyint : SDNode<"ISD::STRICT_FNEARBYINT",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fceil : SDNode<"ISD::STRICT_FCEIL",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_ffloor : SDNode<"ISD::STRICT_FFLOOR",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fround : SDNode<"ISD::STRICT_FROUND",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_ftrunc : SDNode<"ISD::STRICT_FTRUNC",
SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fminnum : SDNode<"ISD::STRICT_FMINNUM",
SDTFPBinOp, [SDNPHasChain,
SDNPCommutative, SDNPAssociative]>;
def strict_fmaxnum : SDNode<"ISD::STRICT_FMAXNUM",
SDTFPBinOp, [SDNPHasChain,
SDNPCommutative, SDNPAssociative]>;
def strict_fpround : SDNode<"ISD::STRICT_FP_ROUND",
SDTFPRoundOp, [SDNPHasChain]>;
def strict_fpextend : SDNode<"ISD::STRICT_FP_EXTEND",
SDTFPExtendOp, [SDNPHasChain]>;

def setcc : SDNode<"ISD::SETCC" , SDTSetCC>;
def select : SDNode<"ISD::SELECT" , SDTSelect>;
def vselect : SDNode<"ISD::VSELECT" , SDTVSelect>;
@@ -1177,6 +1224,74 @@ def setle : PatFrag<(ops node:$lhs, node:$rhs),
def setne : PatFrag<(ops node:$lhs, node:$rhs),
(setcc node:$lhs, node:$rhs, SETNE)>;

// Convenience fragments to match both strict and non-strict fp operations
def any_fadd : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fadd node:$lhs, node:$rhs),
(fadd node:$lhs, node:$rhs)]>;
def any_fsub : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fsub node:$lhs, node:$rhs),
(fsub node:$lhs, node:$rhs)]>;
def any_fmul : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fmul node:$lhs, node:$rhs),
(fmul node:$lhs, node:$rhs)]>;
def any_fdiv : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fdiv node:$lhs, node:$rhs),
(fdiv node:$lhs, node:$rhs)]>;
def any_frem : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_frem node:$lhs, node:$rhs),
(frem node:$lhs, node:$rhs)]>;
def any_fma : PatFrags<(ops node:$src1, node:$src2, node:$src3),
[(strict_fma node:$src1, node:$src2, node:$src3),
(fma node:$src1, node:$src2, node:$src3)]>;
def any_fsqrt : PatFrags<(ops node:$src),
[(strict_fsqrt node:$src),
(fsqrt node:$src)]>;
def any_fsin : PatFrags<(ops node:$src),
[(strict_fsin node:$src),
(fsin node:$src)]>;
def any_fcos : PatFrags<(ops node:$src),
[(strict_fcos node:$src),
(fcos node:$src)]>;
def any_fexp2 : PatFrags<(ops node:$src),
[(strict_fexp2 node:$src),
(fexp2 node:$src)]>;
def any_fpow : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fpow node:$lhs, node:$rhs),
(fpow node:$lhs, node:$rhs)]>;
def any_flog2 : PatFrags<(ops node:$src),
[(strict_flog2 node:$src),
(flog2 node:$src)]>;
def any_frint : PatFrags<(ops node:$src),
[(strict_frint node:$src),
(frint node:$src)]>;
def any_fnearbyint : PatFrags<(ops node:$src),
[(strict_fnearbyint node:$src),
(fnearbyint node:$src)]>;
def any_fceil : PatFrags<(ops node:$src),
[(strict_fceil node:$src),
(fceil node:$src)]>;
def any_ffloor : PatFrags<(ops node:$src),
[(strict_ffloor node:$src),
(ffloor node:$src)]>;
def any_fround : PatFrags<(ops node:$src),
[(strict_fround node:$src),
(fround node:$src)]>;
def any_ftrunc : PatFrags<(ops node:$src),
[(strict_ftrunc node:$src),
(ftrunc node:$src)]>;
def any_fmaxnum : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fmaxnum node:$lhs, node:$rhs),
(fmaxnum node:$lhs, node:$rhs)]>;
def any_fminnum : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fminnum node:$lhs, node:$rhs),
(fminnum node:$lhs, node:$rhs)]>;
def any_fpround : PatFrags<(ops node:$src),
[(strict_fpround node:$src),
(fpround node:$src)]>;
def any_fpextend : PatFrags<(ops node:$src),
[(strict_fpextend node:$src),
(fpextend node:$src)]>;

multiclass binary_atomic_op_ord<SDNode atomic_op> {
def #NAME#_monotonic : PatFrag<(ops node:$ptr, node:$val),
(!cast<SDPatternOperator>(#NAME) node:$ptr, node:$val)> {
4 changes: 2 additions & 2 deletions llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
@@ -78,6 +78,6 @@ bool InstructionSelector::isObviouslySafeToFold(MachineInstr &MI,
std::next(MI.getIterator()) == IntoMI.getIterator())
return true;

-return !MI.mayLoadOrStore() && !MI.hasUnmodeledSideEffects() &&
-empty(MI.implicit_operands());
+return !MI.mayLoadOrStore() && !MI.mayRaiseFPException() &&
+!MI.hasUnmodeledSideEffects() && empty(MI.implicit_operands());
}
3 changes: 2 additions & 1 deletion llvm/lib/CodeGen/ImplicitNullChecks.cpp
@@ -229,7 +229,8 @@ class ImplicitNullChecks : public MachineFunctionPass {
} // end anonymous namespace

bool ImplicitNullChecks::canHandle(const MachineInstr *MI) {
-if (MI->isCall() || MI->hasUnmodeledSideEffects())
+if (MI->isCall() || MI->mayRaiseFPException() ||
+MI->hasUnmodeledSideEffects())
return false;
auto IsRegMask = [](const MachineOperand &MO) { return MO.isRegMask(); };
(void)IsRegMask;
1 change: 1 addition & 0 deletions llvm/lib/CodeGen/MIRParser/MILexer.cpp
@@ -204,6 +204,7 @@ static MIToken::TokenKind getIdentifierKind(StringRef Identifier) {
.Case("nuw" , MIToken::kw_nuw)
.Case("nsw" , MIToken::kw_nsw)
.Case("exact" , MIToken::kw_exact)
.Case("fpexcept", MIToken::kw_fpexcept)
.Case("debug-location", MIToken::kw_debug_location)
.Case("same_value", MIToken::kw_cfi_same_value)
.Case("offset", MIToken::kw_cfi_offset)
1 change: 1 addition & 0 deletions llvm/lib/CodeGen/MIRParser/MILexer.h
@@ -73,6 +73,7 @@ struct MIToken {
kw_nuw,
kw_nsw,
kw_exact,
kw_fpexcept,
kw_debug_location,
kw_cfi_same_value,
kw_cfi_offset,
5 changes: 4 additions & 1 deletion llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -1136,7 +1136,8 @@ bool MIParser::parseInstruction(unsigned &OpCode, unsigned &Flags) {
Token.is(MIToken::kw_reassoc) ||
Token.is(MIToken::kw_nuw) ||
Token.is(MIToken::kw_nsw) ||
-Token.is(MIToken::kw_exact)) {
+Token.is(MIToken::kw_exact) ||
+Token.is(MIToken::kw_fpexcept)) {
// Mine frame and fast math flags
if (Token.is(MIToken::kw_frame_setup))
Flags |= MachineInstr::FrameSetup;
@@ -1162,6 +1163,8 @@ bool MIParser::parseInstruction(unsigned &OpCode, unsigned &Flags) {
Flags |= MachineInstr::NoSWrap;
if (Token.is(MIToken::kw_exact))
Flags |= MachineInstr::IsExact;
+if (Token.is(MIToken::kw_fpexcept))
+Flags |= MachineInstr::FPExcept;

lex();
}
2 changes: 2 additions & 0 deletions llvm/lib/CodeGen/MIRPrinter.cpp
@@ -713,6 +713,8 @@ void MIPrinter::print(const MachineInstr &MI) {
OS << "nsw ";
if (MI.getFlag(MachineInstr::IsExact))
OS << "exact ";
if (MI.getFlag(MachineInstr::FPExcept))
OS << "fpexcept ";

OS << TII->getName(MI.getOpcode());
if (I < E)
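
The lexer, parser and printer changes together make the new flag survive a
MIR round trip. The sketch below models that with plain strings. The `toy`
namespace and both functions are hypothetical and far simpler than the real
MILexer/MIParser/MIRPrinter; only the keywords and bit positions are taken
from the hunks in this patch.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

namespace toy {
// Bit positions as in the MachineInstr::MIFlag enum shown in this patch.
constexpr std::uint32_t NoSWrap = 1u << 12;
constexpr std::uint32_t IsExact = 1u << 13;
constexpr std::uint32_t FPExcept = 1u << 14;

// Emit each set flag as a keyword followed by a space, in printer order.
inline std::string printFlags(std::uint32_t Flags) {
  std::string Out;
  if (Flags & NoSWrap)
    Out += "nsw ";
  if (Flags & IsExact)
    Out += "exact ";
  if (Flags & FPExcept)
    Out += "fpexcept ";
  return Out;
}

// Consume leading flag keywords and stop at the first other token,
// which in real MIR would be the opcode name.
inline std::uint32_t parseFlags(const std::string &Text) {
  std::istringstream In(Text);
  std::string Tok;
  std::uint32_t Flags = 0;
  while (In >> Tok) {
    if (Tok == "nsw")
      Flags |= NoSWrap;
    else if (Tok == "exact")
      Flags |= IsExact;
    else if (Tok == "fpexcept")
      Flags |= FPExcept;
    else
      break;
  }
  return Flags;
}
} // namespace toy
```

Feeding the output of toy::printFlags back into toy::parseFlags yields the
original bit set, which is the property the MIR tests rely on.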
2 changes: 1 addition & 1 deletion llvm/lib/CodeGen/MachineCSE.cpp
@@ -382,7 +382,7 @@ bool MachineCSE::isCSECandidate(MachineInstr *MI) {

// Ignore stuff that we obviously can't move.
if (MI->mayStore() || MI->isCall() || MI->isTerminator() ||
-MI->hasUnmodeledSideEffects())
+MI->mayRaiseFPException() || MI->hasUnmodeledSideEffects())
return false;

if (MI->mayLoad()) {
4 changes: 3 additions & 1 deletion llvm/lib/CodeGen/MachineInstr.cpp
@@ -1178,7 +1178,7 @@ bool MachineInstr::isSafeToMove(AliasAnalysis *AA, bool &SawStore) const {
}

if (isPosition() || isDebugInstr() || isTerminator() ||
-hasUnmodeledSideEffects())
+mayRaiseFPException() || hasUnmodeledSideEffects())
return false;

// See if this instruction does a load. If so, we have to guarantee that the
@@ -1544,6 +1544,8 @@ void MachineInstr::print(raw_ostream &OS, ModuleSlotTracker &MST,
OS << "nsw ";
if (getFlag(MachineInstr::IsExact))
OS << "exact ";
+if (getFlag(MachineInstr::FPExcept))
+OS << "fpexcept ";

// Print the opcode name.
if (TII)
4 changes: 3 additions & 1 deletion llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -579,7 +579,8 @@ static bool isSuccOrder(SUnit *SUa, SUnit *SUb) {
/// Return true if the instruction causes a chain between memory
/// references before and after it.
static bool isDependenceBarrier(MachineInstr &MI, AliasAnalysis *AA) {
-return MI.isCall() || MI.hasUnmodeledSideEffects() ||
+return MI.isCall() || MI.mayRaiseFPException() ||
+MI.hasUnmodeledSideEffects() ||
(MI.hasOrderedMemoryRef() &&
(!MI.mayLoad() || !MI.isDereferenceableInvariantLoad(AA)));
}
@@ -3238,6 +3239,7 @@ bool SwingSchedulerDAG::isLoopCarriedDep(SUnit *Source, const SDep &Dep,

// Assume ordered loads and stores may have a loop carried dependence.
if (SI->hasUnmodeledSideEffects() || DI->hasUnmodeledSideEffects() ||
+SI->mayRaiseFPException() || DI->mayRaiseFPException() ||
SI->hasOrderedMemoryRef() || DI->hasOrderedMemoryRef())
return true;

2 changes: 1 addition & 1 deletion llvm/lib/CodeGen/PeepholeOptimizer.cpp
@@ -1825,7 +1825,7 @@ ValueTrackerResult ValueTracker::getNextSourceFromBitcast() {
assert(Def->isBitcast() && "Invalid definition");

// Bail if there are effects that a plain copy will not expose.
-if (Def->hasUnmodeledSideEffects())
+if (Def->mayRaiseFPException() || Def->hasUnmodeledSideEffects())
return ValueTrackerResult();

// Bitcasts with more than one def are not supported.
13 changes: 13 additions & 0 deletions llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -712,6 +712,7 @@ void ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA,
AAForDep = UseAA ? AA : nullptr;

BarrierChain = nullptr;
SUnit *FPBarrierChain = nullptr;

this->TrackLaneMasks = TrackLaneMasks;
MISUnitMap.clear();
@@ -871,9 +872,21 @@ void ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA,
addBarrierChain(NonAliasStores);
addBarrierChain(NonAliasLoads);

// Add dependency against previous FP barrier and reset FP barrier.
if (FPBarrierChain)
FPBarrierChain->addPredBarrier(BarrierChain);
FPBarrierChain = BarrierChain;

continue;
}

// Instructions that may raise FP exceptions depend on each other.
if (MI.mayRaiseFPException()) {
if (FPBarrierChain)
FPBarrierChain->addPredBarrier(SU);
FPBarrierChain = SU;
}

// If it's not a store or a variant load, we're done.
if (!MI.mayStore() &&
!(MI.mayLoad() && !MI.isDereferenceableInvariantLoad(AA)))
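
The FP barrier chain added above can be pictured with a small stand-alone
model. ToyInstr and buildFPChain are hypothetical; the sketch assumes the
block is walked from the last instruction to the first (as buildSchedGraph
does) and records only the ordering edges, not real SUnit dependencies.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy instruction: just the two properties the chain logic looks at.
struct ToyInstr {
  bool MayRaiseFPException;
  bool IsBarrier; // e.g. a call or an instruction with unmodeled side effects
};

// Returns (earlier, later) index pairs: "later" must stay after "earlier".
inline std::vector<std::pair<std::size_t, std::size_t>>
buildFPChain(const std::vector<ToyInstr> &Block) {
  std::vector<std::pair<std::size_t, std::size_t>> Edges;
  bool HaveChain = false;
  std::size_t Chain = 0; // most recently visited (i.e. later) chain member

  // Walk bottom-up, linking each chain member to the one visited before it.
  for (std::size_t I = Block.size(); I-- > 0;) {
    const ToyInstr &MI = Block[I];
    if (MI.IsBarrier || MI.MayRaiseFPException) {
      if (HaveChain)
        Edges.emplace_back(I, Chain); // I becomes a predecessor of Chain
      Chain = I;
      HaveChain = true;
    }
  }
  return Edges;
}
```

Instructions that neither raise FP exceptions nor act as barriers are skipped,
so they stay free to be scheduled across the chain.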
3 changes: 3 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
@@ -883,6 +883,9 @@ EmitMachineNode(SDNode *Node, bool IsClone, bool IsCloned,

if (Flags.hasExact())
MI->setFlag(MachineInstr::MIFlag::IsExact);

if (Flags.hasFPExcept())
MI->setFlag(MachineInstr::MIFlag::FPExcept);
}

// Emit all of the actual operands of this instruction, adding them to the