Document number: | N4126 |
Date: | 2014-07-29 |
Project: | Programming Language C++, Language Evolution Working Group |
Reply-to: | Oleg Smolsky <[email protected]> |
N3950 was presented to the Evolution WG at the Rapperswil meeting and the response was very positive. A later revision, N4114, was amended to handle the following points requested at the meeting:
This proposal makes the following changes after the technical review on the c++std-ext list:
Equality for composite types is an age-old problem which affects an overwhelming chunk of the programming community, specifically the users who program with Regular[1] types and expect that they would compose naturally.
The proposal presents means of generating default equality/inequality for Regular, as well as relational operators for Totally Ordered[2] user-defined types. Such types should be trivial to implement and easy to understand.
Specifically, I argue that:
- operator==() implies that the type is Regular. I expect that the equality is defined to return true if and only if both objects represent the same value (this relation is transitive, reflexive and symmetric).
- operator<() implies that the type is Totally Ordered.
Finally, the feature is strictly "opt in" so that semantics of existing code remain intact.
Simple means of doing equality are really useful in modern C++ code that operates with types composed of Regular members. The definition of equality is trivial in such cases - member-wise comparison. Inequality should then be generated as its boolean negation.
This proposal focuses on Regular and Totally Ordered types as they naturally compose. Such cases are becoming more prevalent as people program more with value types and so writing the same equality and relational operators becomes tiresome. This is especially true when trying to lexicographically compare members to achieve total ordering.
Consider the following trivial example where a C++ type represents some kind of user record:
struct user {
    uint32_t id, rank, position;
    std::string first_name, last_name;
    std::string address1, address2, city, state, country;
    uint32_t us_zip_code;

    friend bool operator==(const user &, const user &);
    friend bool operator!=(const user &, const user &);
    friend bool operator<(const user &, const user &);
    friend bool operator>=(const user &, const user &);
    friend bool operator>(const user &, const user &);
    friend bool operator<=(const user &, const user &);
};

bool operator==(const user &a, const user &b) {
    return a.id == b.id &&
           a.rank == b.rank &&
           a.position == b.position &&
           a.address1 == b.address1 &&
           a.address2 == b.address2 &&
           a.city == b.city &&
           a.state == b.state &&
           a.country == b.country &&
           a.us_zip_code == b.us_zip_code;
}
Also, the composite type is naturally Totally Ordered, yet that takes even more code:
bool operator<(const user &a, const user &b) {
    // I could implement the full lexicographical comparison of members manually, yet I
    // choose to cheat by using standard libraries
    return std::tie(a.id, a.rank, a.position,
                    a.address1, a.address2, a.city, a.state, a.country,
                    a.us_zip_code) <
           std::tie(b.id, b.rank, b.position,
                    b.address1, b.address2, b.city, b.state, b.country,
                    b.us_zip_code);
}

Specifically, this code, while technically required, suffers from the following issues:
It is vital that equal/unequal, less/more-or-equal and more/less-or-equal pairs behave as boolean negations of each other. After all, we are building total ordering and the world would make no sense if both operator==() and operator!=() returned false!
As such, it is common to implement these operators in terms of each other.
Inequality for Regular types:
bool operator!=(const user &a, const user &b) { return !(a == b); }
Relational operators for Totally Ordered types:
bool operator>=(const user &a, const user &b) { return !(a < b); }
bool operator>(const user &a, const user &b) { return b < a; }
bool operator<=(const user &a, const user &b) { return !(a > b); }

Notes:
- operator<() must remain transitive in its nature,
- operator==() is supposed to be symmetric, reflexive and transitive.
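The pattern described above can be sketched in compilable form on a smaller type. This is only an illustration: the `account` type and the `check_account_operators` helper are hypothetical names that do not appear in the proposal. Equality and less-than are written member-wise, and the other four operators are derived from them.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical, cut-down record used only for illustration.
struct account {
    std::uint32_t id;
    std::string name;
};

// The two primary operators are written member-wise.
inline bool operator==(const account &a, const account &b) {
    return a.id == b.id && a.name == b.name;
}
inline bool operator<(const account &a, const account &b) {
    return std::tie(a.id, a.name) < std::tie(b.id, b.name);
}

// The remaining four are derived from the primary two.
inline bool operator!=(const account &a, const account &b) { return !(a == b); }
inline bool operator>(const account &a, const account &b)  { return b < a; }
inline bool operator<=(const account &a, const account &b) { return !(b < a); }
inline bool operator>=(const account &a, const account &b) { return !(a < b); }

// Small usage check: each pair behaves as a boolean negation of the other.
inline void check_account_operators() {
    const account x{1, "ann"}, y{1, "bob"};
    assert(x == x && x != y);
    assert(x < y && y > x);
    assert(x <= y && !(x >= y));
}
```

Even for a two-member type this takes six hand-written functions, which is the boilerplate the proposal aims to remove.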
Member-wise generation of special functions is already present in the Standard (see Section 12), so it seems natural to extend the scope of generation and reuse the existing syntax.
The proposed syntax for generating the new explicitly defaulted non-member operators is as follows:
struct Thing {
    int a, b, c;
    std::string d;
};

bool operator==(const Thing &, const Thing &) = default;
bool operator!=(const Thing &, const Thing &) = default;
There are cases where members are private and so the operators need to be declared as friend. Consider the following syntax:
class AnotherThing {
    int a, b;
public:
    // ...
    friend bool operator<(AnotherThing, AnotherThing) = default;
    friend bool operator>(AnotherThing, AnotherThing) = default;
    friend bool operator<=(AnotherThing, AnotherThing) = default;
    friend bool operator>=(AnotherThing, AnotherThing) = default;
};
I feel this is a natural choice because:
Several committee members expressed a strong desire for a shorter form of notation that would radically reduce the amount of code it takes to declare the non-member operators. Here is the short-hand that expands to the long form defined above.
struct Thing {
    int a, b, c;
    std::string d;

    default: ==, !=, <, >, <=, >=;   // defines the six non-member functions
};
Notes:
It is possible to write a templated "CRTP" base class that implements equality and relational operators. For an example, see Boost.Operators.
Comment from Nevin Liber: I believe this breaks standard layout if you use private derivation (and it is unlikely that we will want public derivation, given the deprecation of things like unary_function and binary_function).
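For readers unfamiliar with the technique, here is a minimal sketch of such a CRTP base. It is not the Boost.Operators implementation; the names `comparable`, `version` and `check_version_operators` are hypothetical and chosen only for this illustration. The base class injects four operators that are derived from the `==` and `<` supplied by the derived type.

```cpp
#include <cassert>

// Hypothetical CRTP helper (an assumption for illustration, not Boost's code):
// derives four operators from the Derived type's own == and <.
template <typename Derived>
struct comparable {
    friend bool operator!=(const Derived &a, const Derived &b) { return !(a == b); }
    friend bool operator>(const Derived &a, const Derived &b)  { return b < a; }
    friend bool operator<=(const Derived &a, const Derived &b) { return !(b < a); }
    friend bool operator>=(const Derived &a, const Derived &b) { return !(a < b); }
};

// The user still has to write == and < by hand.
struct version : comparable<version> {
    int major_no, minor_no;
    version(int mj, int mn) : major_no(mj), minor_no(mn) {}

    friend bool operator==(const version &a, const version &b) {
        return a.major_no == b.major_no && a.minor_no == b.minor_no;
    }
    friend bool operator<(const version &a, const version &b) {
        if (a.major_no != b.major_no) return a.major_no < b.major_no;
        return a.minor_no < b.minor_no;
    }
};

// Small usage check of the derived operators.
inline void check_version_operators() {
    const version a(1, 2), b(1, 3);
    assert(a != b && b > a);
    assert(a <= b && !(a >= b));
    assert(a >= a && a <= a);
}
```

Note that the member-wise `==` and `<` remain hand-written in this approach, so the library technique removes only part of the boilerplate.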
Other committee members mentioned "upcoming, to be specified" reflection facilities, yet I feel a first-class language feature is needed now.
The Evolution WG was divided on the mutable treatment. There were two mutually exclusive views:
1. Exclude mutable members from the comparison operators.
2. Include mutable members when doing comparisons (i.e. no special treatment).
I prefer option (1) above, yet the only way to resolve the committee deadlock is to make code with such members ill-formed. The user would have to implement the comparison operators manually. The committee thus reserves an option to reconsider the decision at a later stage, as part of a follow-up proposal.
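As an illustration of what "implement the comparison operators manually" could look like, here is a sketch of a type with a mutable bookkeeping member. The `cached_name` class and its members are hypothetical and not part of the proposal. The hand-written operators compare the logical value only, which happens to match the first view above; a user who holds the second view would compare the counter too.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type: hits_ is bookkeeping, not part of the logical value.
class cached_name {
    std::string name_;
    mutable unsigned hits_ = 0;   // mutated from a const accessor
public:
    explicit cached_name(std::string n) : name_(std::move(n)) {}

    const std::string &get() const { ++hits_; return name_; }
    unsigned hits() const { return hits_; }

    // Hand-written: the mutable member is deliberately left out.
    friend bool operator==(const cached_name &a, const cached_name &b) {
        return a.name_ == b.name_;
    }
    friend bool operator!=(const cached_name &a, const cached_name &b) {
        return !(a == b);
    }
};

// Small usage check: differing access counts do not affect equality.
inline void check_cached_name() {
    const cached_name a("x"), b("x");
    a.get();
    assert(a.hits() == 1 && b.hits() == 0);
    assert(a == b);
}
```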
The feedback on the c++std-ext list included the "single member wrapper struct" case where the author expects every overloaded operator of the wrapper to work consistently with those of the member.
Consider the following user-defined type:
struct wrapper {
    double val;

    default: ==, !=, <, >, <=, >=;   // defines the six non-member functions
};

Such a thing would be built from a double and it makes sense to build the equality and relational operators based on the member's operators. This treatment covers every possible case: total order, strict weak order and even partial order (as the ambiguities in the last case are simply bridged to the caller).
Namely,
bool operator==(const wrapper &a, const wrapper &b) { return a.val == b.val; }
bool operator!=(const wrapper &a, const wrapper &b) { return a.val != b.val; }
bool operator<(const wrapper &a, const wrapper &b) { return a.val < b.val; }
bool operator<=(const wrapper &a, const wrapper &b) { return a.val <= b.val; }
bool operator>(const wrapper &a, const wrapper &b) { return a.val > b.val; }
bool operator>=(const wrapper &a, const wrapper &b) { return a.val >= b.val; }
The original use case for the proposal revolves around user-defined types that contain many regular members. These types must receive memberwise implementations of operator==() and operator<(), and the other operators may be derived.
Namely, consider the shortest implementation of operator>=():
bool operator>=(const thing &t1, const thing &t2) { return !(t1 < t2); }
Notes on this option:
std::pair
and std::tuple
operator<()
Conclusion: the most consistent and straightforward option is to follow the dominating single-member case and generate each explicitly defaulted operator fully.
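The practical difference between the two options shows up as soon as a member is only partially ordered. The sketch below assumes IEEE 754 floating point with default compiler settings; `derived_ge` and `check_wrapper_nan` are hypothetical names introduced here. It contrasts the fully generated, forwarding operator>=() with one derived as the negation of operator<().

```cpp
#include <cassert>
#include <limits>

struct wrapper { double val; };

// Fully generated operators: each one forwards to the member's own operator.
inline bool operator<(const wrapper &a, const wrapper &b)  { return a.val < b.val; }
inline bool operator>=(const wrapper &a, const wrapper &b) { return a.val >= b.val; }

// Hypothetical alternative: >= derived as the negation of <.
inline bool derived_ge(const wrapper &a, const wrapper &b) { return !(a < b); }

// Small usage check (assumes IEEE 754 NaN semantics).
inline void check_wrapper_nan() {
    const wrapper n{std::numeric_limits<double>::quiet_NaN()};
    const wrapper one{1.0}, two{2.0};

    // For ordinary values both formulations agree.
    assert((two >= one) == derived_ge(two, one));
    assert((one >= two) == derived_ge(one, two));

    // With NaN they disagree: the forwarding version matches the built-in
    // double comparison, the derived version does not.
    assert(!(n.val >= one.val));
    assert(!(n >= one));
    assert(derived_ge(n, one));
}
```

The forwarding version keeps the wrapper consistent with its member, which is the behaviour the single-member feedback asked for.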
struct thing { int a, b, c; };
Option 1: use the members' strict weak order to implement operator<(). This is consistent with std::pair and std::tuple.
bool operator<(const thing &t1, const thing &t2) {
    if (t1.a < t2.a) return true;
    if (t2.a < t1.a) return false;
    if (t1.b < t2.b) return true;
    if (t2.b < t1.b) return false;
    return t1.c < t2.c;
}
Option 2: use the members' total order to implement operator<(). This puts an implicit dependency on operator==().
bool operator<(const thing &t1, const thing &t2) {
    if (t1.a != t2.a) return t1.a < t2.a;
    if (t1.b != t2.b) return t1.b < t2.b;
    return t1.c < t2.c;
}
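For totally ordered members such as int the two options give the same answer. They can diverge when a member is not totally ordered. The sketch below uses a hypothetical two-member variant, `thing_d`, with double members, and assumes IEEE 754 NaN semantics; the function names `less_opt1` and `less_opt2` are also made up for this illustration.

```cpp
#include <cassert>
#include <limits>

// Hypothetical two-member variant of `thing` with floating-point members.
struct thing_d { double a, b; };

// Option 1: relies only on the members' operator<.
inline bool less_opt1(const thing_d &x, const thing_d &y) {
    if (x.a < y.a) return true;
    if (y.a < x.a) return false;
    return x.b < y.b;
}

// Option 2: relies on the members' operator!= as well.
inline bool less_opt2(const thing_d &x, const thing_d &y) {
    if (x.a != y.a) return x.a < y.a;
    return x.b < y.b;
}

// Small usage check (assumes IEEE 754 NaN semantics).
inline void check_options() {
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // Ordinary values: both options agree.
    const thing_d r{1.0, 5.0}, s{2.0, 0.0};
    assert(less_opt1(r, s) && less_opt2(r, s));
    assert(!less_opt1(s, r) && !less_opt2(s, r));

    // NaN in the first member: Option 1 treats the NaNs as equivalent and
    // moves on to b, while Option 2 stops at a because NaN != NaN is true.
    const thing_d p{nan, 1.0}, q{nan, 2.0};
    assert(less_opt1(p, q));
    assert(!less_opt2(p, q));
}
```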
There are some built-in types that are not totally ordered or cannot always be compared. Namely, < is only defined for pointers of the same type that refer to memory allocated from a single contiguous region, and IEEE floating point numbers have the NaN value, for which comparisons are defined in a very special way.
Design decisions:
A function definition of the form:
    attribute-specifier-seq_opt decl-specifier-seq_opt declarator virt-specifier-seq_opt = default ;
is called an explicitly-defaulted definition. A function that is explicitly defaulted shall
— be a special member function, or an explicitly defaultable operator function. See [defaultable]
After 8.4.3 add a new section
8.4.4 Explicitly defaultable operator functions [defaultable]
The following friend operator functions are explicitly defaultable:
- Non-member equality operators: operator==(), operator!=(), see [class.equality]
- Non-member relational operators: operator<(), operator>(), operator<=(), operator>=(), see [class.relational]
The default constructor (12.1), copy constructor and copy assignment operator (12.8), move constructor and move assignment operator (12.8) and destructor (12.4) are special member functions. These, together with equality operators (12.10) and relational operators (12.11) may be explicitly defaulted as per [dcl.fct.def.default]
After 12.9 add a new section
12.10 Equality operators [class.equality]
1. A class may provide overloaded operator==() and operator!=() as per [over.oper]. A default implementation of these non-member operators may be generated via the = default notation as it may be explicitly defaulted as per [dcl.fct.def.default].
2. The defaulted operator==() definition is generated if and only if all sub-objects are fundamental types or compound types thereof, that provide operator==().
3. If a class with a defaulted operator==() has a mutable member, the program is ill-formed.
4. The defaulted operator==() for class X shall take two arguments of type X by value or by const reference and return bool.
5. The explicitly defaulted non-member operator==() for a class X shall perform memberwise equality comparison of its subobjects. Namely, a comparison of the subobjects that have the same position in both objects against each other until one subobject is not equal to the other.
   Direct base classes of X are compared first, in the order of their declaration in the base-specifier-list, and then the immediate non-static data members of X are compared, in the order in which they were declared in the class definition.
   Let x and y be the parameters of the defaulted operator function. Each subobject is compared in the manner appropriate to its type:
   - if the subobject is of class type, as if by a call to operator==() with the subobject of x and the corresponding subobject of y as function arguments (as if by explicit qualification; that is, ignoring any possible virtual overriding functions in more derived classes);
   - if the subobject is an array, each element is compared in the manner appropriate to the element type;
   - if the subobject is of a scalar type, the built-in == operator is used.
6. The explicitly-defaulted non-member operator!=() for a non-union class shall be implemented in a manner described in (5) while calling operator!=() and the built-in != operator where appropriate.

Example:
struct T {
    int a, b, c;
    std::string d;
};

bool operator==(const T &, const T &) = default;

Note, floating point values are regular only in the domain of normal values (outside of the NaN) and so the explicitly-defaulted non-member operators are only defined in that domain too.
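To make the comparison order in [class.equality] concrete, here is a hand-written sketch of what a defaulted operator==() might be equivalent to for a class with a base class, an array member and a class-type member. The types `base` and `derived` and the helper `make_derived` are hypothetical, and the code is an interpretation of the wording rather than output of any implementation.

```cpp
#include <cassert>
#include <string>

struct base { int id; };

struct derived : base {
    int samples[3];
    std::string label;
};

inline bool operator==(const base &x, const base &y) { return x.id == y.id; }

// Hand-written stand-in for the proposed "= default" on `derived`.
inline bool operator==(const derived &x, const derived &y) {
    // Direct base classes first, in declaration order.
    if (!(static_cast<const base &>(x) == static_cast<const base &>(y))) return false;
    // Array member: each element is compared in turn.
    for (int i = 0; i < 3; ++i)
        if (!(x.samples[i] == y.samples[i])) return false;
    // Class-type member: as if by a call to its operator==().
    return x.label == y.label;
}

// Hypothetical helper used only to build test values.
inline derived make_derived(int id, int fill, const std::string &label) {
    derived d;
    d.id = id;
    for (int i = 0; i < 3; ++i) d.samples[i] = fill;
    d.label = label;
    return d;
}
```

Two objects compare equal only if the base subobject, every array element and the string all compare equal, and the comparison stops at the first mismatch.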
After 12.10 add a new section
12.11 Relational operators [class.relational]
1. A class may provide overloaded relational operators as per [over.oper]. A default implementation of a non-member relational operator may be generated via the = default notation as these may be explicitly defaulted as per [dcl.fct.def.default].
2. The defaulted operator<() definition is generated if and only if all sub-objects are fundamental types or compound types thereof, that provide operator<().
3. If a class with a defaulted operator<() has a mutable member, the program is ill-formed.
4. The defaulted operator<() for class X shall take two arguments of type X by value or by const reference and return bool.
5. The explicitly-defaulted operator<() for a class X shall perform memberwise lexicographical comparison of its subobjects. Namely, a comparison of the subobjects that have the same position in both objects against each other until one subobject is not equivalent to the other. The result of comparing these first non-matching elements is the result of the function.
   Direct base classes of X are compared first, in the order of their declaration in the base-specifier-list, and then the immediate non-static data members of X are compared, in the order in which they were declared in the class definition.
   Let x and y be the parameters of the defaulted operator function. Each subobject is compared in the manner appropriate to its type:
   - if the subobject is of class type, as if by a call to operator<() with the subobject of x and the corresponding subobject of y as function arguments (as if by explicit qualification; that is, ignoring any possible virtual overriding functions in more derived classes);
   - if the subobject is an array, each element is compared in the manner appropriate to the element type;
   - if the subobject is of a scalar type, the built-in < operator is used.
6. An explicitly-defaulted non-member operator>() for a non-union class shall be implemented in a manner described in (5) while calling operator>() and the built-in > operator where appropriate.
7. An explicitly-defaulted non-member operator>=() for a non-union class shall be implemented in a manner described in (5) while calling operator>=() and the built-in >= operator where appropriate.
8. An explicitly-defaulted non-member operator<=() for a non-union class shall be implemented in a manner described in (5) while calling operator<=() and the built-in <= operator where appropriate.

Example:
struct T {
    int a, b;
    friend bool operator<(T, T) = default;
};

Note, pointer comparisons are only defined for a subset of values, floating point values are totally ordered only in the domain of normal values (outside of the NaN), so the explicitly-defaulted non-member operators are only defined in the domain of members' normal values.
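The lexicographical rule in [class.relational] can be sketched by hand as follows. The `reading` type is hypothetical, and using std::lexicographical_compare for the array member is one possible reading of "each element is compared in the manner appropriate to the element type", not wording taken from the proposal.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Hypothetical type with a scalar member followed by an array member.
struct reading {
    int channel;
    int values[2];
};

// Hand-written stand-in for the proposed "= default" operator<.
inline bool operator<(const reading &x, const reading &y) {
    // Members are visited in declaration order; the first pair that is
    // not equivalent decides the result.
    if (x.channel < y.channel) return true;
    if (y.channel < x.channel) return false;
    // Array member: element-wise lexicographical comparison using only <.
    return std::lexicographical_compare(std::begin(x.values), std::end(x.values),
                                        std::begin(y.values), std::end(y.values));
}

// Small usage check.
inline void check_reading_order() {
    const reading a{1, {2, 3}}, b{1, {2, 4}}, c{0, {9, 9}};
    assert(a < b && !(b < a));   // decided by the last array element
    assert(c < a);               // decided by the first member
    assert(!(a < a));            // irreflexive
}
```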
After 12.11 add a new section
12.12 Explicitly defaulted equality and relational operators - short form [class.oper-short]
- A class may provide explicitly defaulted equality and relational operators as per [class.equality] and [class.relational] respectively. These non-member operators can also be generated via the short form of the notation:
    default: [the comma-separated list of operators];
- The following six short-hand names map to the explicitly defaultable equality and relational operators: ==, !=, <, <=, >, >=.
- The implementation must expand each term of the short form into a full declaration subject to [class.equality] and [class.relational], while choosing how to pass the arguments in order to maximize performance.

Example:
struct Thing {
    int a, b, c;
    std::string d;

    default: ==, !=;   // defines equality/inequality non-member functions
};
I believe the fundamental idea comes from Alex Stepanov and his work[3] on regular types. These types are generalizations of the built-in types, so they need to support copying, assignment, and comparison. The C++ language has natively supported the first two points from the beginning and this proposal attempts to address the last one.
I want to thank Andrew Sutton for the early feedback and guidance, as well as Bjarne Stroustrup for loudly asking for consensus on small, fundamental language cleanups that strive to make users' lives easier.
Editorial credits go to Daniel Krügler, Ville Voutilainen, Jens Maurer and Lawrence Crowl - thank you for helping with the technical specification!
Finally, many folks on the c++std-ext list have provided valuable advice and guidance. Thank you for the lively discussion and your help with steering the design!
operator<() is defined, operator>=() is defined as its boolean negation
See "Elements of programming" by Alexander Stepanov and Paul McJones for a full treatment of Regular and Totally Ordered concepts.