Axel Naumann
Hi!
We all use it, with different levels of frequency and perfection. Most of us love it for its speed and hate it for its syntax - but we have no reason to think about how to improve it, or about where we hit the limits of C++. A bit like The Truman Show. Let's have a look at C++ itself for a minute.
Of course there is a good reason for taking C++ for granted: C++ is impossible to change by you and me and CERN. It is "governed" by an ISO committee; they define what "the standard" is. They do what committees do, excessively, and then, once in a while (1998, 2003, and soon, probably 2010 - formally called C++0x, which I'll call C++0A), they decide to publish a new revision of the standard. What's in there is the result of endless discussions, based on a small set of fundamental rules. Walter Brown, a member of the committee, was at CERN this week, and he reported that one of those rules is: make C++ easier to use for novices.
I think that's a brilliant rule, not only because novices are the majority, but also because it allows C++ to suck people in: get them hooked as novices, and once they feel accustomed they can start digging deeper. And it keeps the committee rooted in reality, not in computer scientists' frenzy.
So why am I mentioning the novice rule? Because I believe the C++ committee has forgotten it. That sounds like it might barely justify a minute of air time over a beer with your dearest C++ coder friend, but in fact it will have a huge impact on our code.
One of the reasons why scripting languages like Python, Perl, bash, (Visual) Basic, Ruby, ... are so widely used is their clarity and readability, and the ease with which they let you express algorithms. Their ambiguity is in meaning (for the CPU, which is why you cannot compile them), not in syntax: they let you write the exact same expression for many cases. In C++0A it will be the other way around: you get a plethora of different, mutually redundant ways to express the same thing.
Examples are return types, void f() versus auto f() -> void, and member initialization: via in-class initializers (class A { B b = B::f(); };), via constructors (A::A() : b(B::f()) {}), or even via initializer lists (A::A() : b{B::f()} {}). Those are just two examples; there is also the new for-loop syntax and strongly typed ("class") enumerations - both shown in the sketch below.
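To make the redundancy concrete, here is a minimal sketch of the parallel spellings, plus the new for-loop and a strongly typed enumeration. The struct B with its static f(), the names f_old, f_new, Color and loop are placeholders of mine, not from any real code base:

#include <vector>

void f_old();                 // classic declaration
auto f_new() -> void;         // C++0A trailing return type, same meaning

enum class Color { Red, Green };   // strongly typed ("class") enumeration

struct B { static B f() { return B(); } };

class A {
public:
    A() : b(B::f()) {}        // classic member initialization
    // A() : b{B::f()} {}     // the same, with a C++0A brace initializer
    // B b = B::f();          // and yet another way: an in-class initializer
private:
    B b;
};

void loop(const std::vector<int>& v) {
    for (std::vector<int>::const_iterator i = v.begin(); i != v.end(); ++i) {}  // C++03
    for (int x : v) { (void)x; }                                                // C++0A range-for
}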
The second issue is the overloading of meanings. We all know the keywords delete, the sibling of new, and default, the sibling of switch's case. These two will now have an additional meaning! They can signal that e.g. a constructor is meant to be the compiler-generated one (default) or that it is meant to not be available (delete).
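For illustration, a minimal sketch of the new use, with a hypothetical class NonCopyable:

class NonCopyable {
public:
    NonCopyable() = default;                              // "give me the compiler-generated one"
    NonCopyable(const NonCopyable&) = delete;             // "this constructor shall not exist"
    NonCopyable& operator=(const NonCopyable&) = delete;  // same for assignment
};

// NonCopyable a;       // fine: uses the defaulted constructor
// NonCopyable b(a);    // compile error: the copy constructor is deleted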
Don't get me wrong: the idea is brilliant - but why oh why reuse a keyword? I asked, and the answer was: "We must not break people's code". OK, I buy that: introducing e.g. "i" as a new keyword would break a whole lot of code. But on the other hand they did introduce several new keywords (constexpr, decltype, nullptr, static_assert). So the real reason is that default and delete were found to be a good enough fit for this use case, and no new keyword was needed. For default I agree: the context is completely different. But I can implement my own operator delete as a class member - so within class A {...} the keyword delete now has two completely separate meanings. That's bad, especially for novices. It's like "free": it means you don't pay, unless it is preceded by "sugar" or "fat", or followed by "people" or "minds" or "press" or "radical". Most languages have come up with separate words for these separate concepts, and that's just so much clearer - especially for language novices! Maybe that guy who wrote "The people must be free." is a freakin' slave dealer after all! (It's James Albert Woodburn and no, he's not.)
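A minimal sketch of that clash, with a placeholder class A of my own: within the same pair of braces, delete once means "this function does not exist" and once names a deallocation function.

#include <new>   // for ::operator delete

class A {
public:
    A(const A&) = delete;      // "delete", new meaning: copying an A is forbidden

    // "delete", old meaning: a class-scope deallocation function,
    // the one invoked by "delete somePointerToA;".
    static void operator delete(void* p) { ::operator delete(p); }
};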
So I believe C++ is not becoming easier at all; it will not help novices. They will now have to learn both the 98/03 syntax and the 0A syntax. The complexity of the syntax will increase (e.g. if people use lambda functions), and code can become unreadable. Even computer programs will have a really, really, really hard time parsing C++0A code that conforms to the 0A standard. I think the committee would have done better to create a new language, ++C++, instead of extending C++ with (nice!) new features and creating an unreadable, unparsable, ugly syntax. C++ already has a very intricate syntax, and the way they want to patch new language features in now will make it absurd.
// needs decl of total, a
auto f(int i[]{2,3}) const -> decltype(a->sum) {
    total = 0;
    double value{4};
    auto o = [&, value](int x) -> decltype(total) {
        int z{x*x};
        return total += (z * value) + z;
    };
    int ret = o(i[0]);
    return ret * o(i[1]);
}
No, I am not certain that this code snippet will compile; I am new to C++0A. But do you know what the return value will be if f returns int? It's not easy to figure out, right? Is this really still C++? For me there are two lessons to be learned:
- Lesson I: If there is the slightest chance that someone with a different C++ taste than you might read your code, then do not use 0A syntax - or, if you are powerful and you rule, then enforce it everywhere, in a single consistent version (i.e. not both versions of for-loops, not both versions of return type declarations). I predict that my daughter will be a better programmer than me before ROOT has C++0A-only syntax in its headers. My daughter is currently 1.83 years old.
- Lesson II: We should let the C++ committee know better what our issues are. We are pretty unique in our use of C++: there are not many cases where humongous amounts of C++ source code are meant to be used by thousands of people. Documentation, clarity and readability are key issues for us. If you want your code to be used, you have to make it easy to read - and I find C++0A's new features not helpful in that respect.
Lucky us! And lucky you, still reading this paragraph! Bjarne Stroustrup, who designed and implemented the C++ programming language, will visit CERN at the beginning of September. He will give the "big" presentation we all want to see, but he is also interested in hearing our view of C++: where we hit the limits, where we have problems with it. I am collecting comments from basically everyone writing C++ in the High Energy Physics context, and we will be able to discuss them in a small technical meeting with Bjarne Stroustrup. This is about leaving the stage for a second, Mr Truman. If there is anything that you would like to have mentioned, something that you believe makes C++ non-optimal in our context, then please comment, so we can ask HIM!
Happy coding!
Comments
Submitted by Anonymous on Tue, 09/08/2009 - 19:01
Invalid C++03 does not make C++0A bad.
I'm obviously not a novice, but when trying to demonstrate a problem of C++0A compared to C++03, I do think you should do something that would also be valid in C++03 (with C++03 syntax, of course).
The standard - I think since its C ancestry - calls the behavior you're invoking in the return line "undefined".
o(i[0])*o(i[1]) writes (at least once) to the variable total in the same expression that reads it twice. So given that i[0] and i[1] have different values, you depend on the order of evaluation the compiler uses for that expression (inlining can bring the number of possible results up from just 2).
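For illustration, a minimal standalone sketch of the order-of-evaluation issue described above; the helper bump() and the global total are made up, not taken from the post's snippet:

#include <iostream>

int total = 0;

int bump(int x) { return total += x; }

int main() {
    // The order in which the two calls are evaluated is unspecified, so the
    // product can be 1 * 3 = 3 (bump(1) runs first) or 2 * 3 = 6 (bump(2) runs first).
    std::cout << bump(1) * bump(2) << '\n';
}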
I think that lambda needs a return type of -> decltype(total), rather than defaulting to void, to be valid (yeah yeah, syntax). Doing that should make the dependency on the type of total even clearer. This kind of dependency (what the type of x is and how that changes the behavior of code) is nothing new with C++0A.
Given that it returns an int, according to your statement, there will be a double (type of value) to int conversion somewhere; without a cast, those tend to give errors.
In short, your example shows unfamiliarity with C++0A more than real problems. I think most of the "problems" will get worked around easily, and genuinely bad ideas will see appropriately little use (see throw specifications). And then I'll get to say the whole thing over again when C++1x is about to become reality :)
We* will work out which (complete) subset is the easiest to learn, and teach that first. In theory, a bigger selection means the "optimal" choice can only get better; at worst, it remains unaffected. (In practice, several "C++ teachers" still teach C with iostreams...) They will also have to learn to at least read the rest eventually, true.
--MaHuJa
Submitted by Anonymous on Wed, 09/09/2009 - 16:37
Re: Invalid C++03 does not make C++0A bad.
Hi MaHuJa!
Thanks so much for your terrific comments! I agree with all your bug fixes - I have changed the code accordingly. To quote Bjarne Stroustrup (literally): "The good think about compilers is exactly that they are pedantic - saves on debugging :-)" I already warned you that I am unfamiliar with 0x, and that shows. But I still believe that this code shows the complexity C++0x allows.
You argue that we can select the kind of syntax we want - except when reading code. And exactly the latter is the big issue!
In big environments like the LHC experiments' code bases you have to enforce a certain syntax. Usually that's done via nightly tests: the code is fine if the experiment's compiler eats it. What will happen when the experiments move to C++0x-capable compilers? My expectation is that some code will use C++0x-only syntax, and given that there are tens of millions of lines of code, it will probably pretty much cover all the features.
My prediction is that we will have to allow most of the C++0x features soon, or students will not be able to read their experiments' code. And this code (especially the headers) is the main user interface of the experiments! Proof: even now you cannot understand ATLAS code if all you know is C plus iostream.
Bjarne Stroustrup has always emphasized that these features are optional. I'd argue that without more restrictive syntax requirements for our code, knowing the new syntax becomes mandatory just to read it. So here, too, I agree with you! :-)
Cheers, Axel
Submitted by Anonymous on Mon, 08/03/2009 - 09:44
The new C++ syntax is most like D language
There is a new language named D: http://digitalmars.com/d/ . D1 is stable now; D2 is still under development. The biggest headache is the lack of libraries usable with D. The TIOBE index (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html) gives an idea of D's popularity. :-) Obviously, it is impossible to rewrite ROOT in D. But how about an interface? Anyway, I'm still waiting until D and the libraries that come with it are mature.
Submitted by Anonymous on Wed, 10/28/2009 - 13:54
And what's the point? ROOT is
And what's the point? ROOT is so broken that you won't get much improvement from a better language. Secondly, C++ is awfully difficult to interface with: generally one is required to define a C interface on top of the C++ one and call those functions. All object orientation is lost on the way unless special precautions are taken.
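For readers unfamiliar with that pattern, a minimal, hypothetical sketch of wrapping a C++ class behind a flat C interface; Histogram and the histo_* functions are made up, not a real ROOT or experiment API - and this is exactly where the object orientation gets flattened away:

// Hypothetical C++ class with state and member functions.
class Histogram {
public:
    Histogram() : entries_(0) {}
    void Fill(double /*x*/) { ++entries_; }
    int Entries() const { return entries_; }
private:
    int entries_;
};

// C interface on top of it: an opaque handle plus free functions.
extern "C" {
    typedef struct HistoHandle HistoHandle;   // opaque to C callers

    HistoHandle* histo_create() { return reinterpret_cast<HistoHandle*>(new Histogram); }
    void histo_fill(HistoHandle* h, double x) { reinterpret_cast<Histogram*>(h)->Fill(x); }
    int histo_entries(const HistoHandle* h) { return reinterpret_cast<const Histogram*>(h)->Entries(); }
    void histo_destroy(HistoHandle* h) { delete reinterpret_cast<Histogram*>(h); }
}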
Submitted by Anonymous on Wed, 11/04/2009 - 21:48
+1
I agree. That's right, changing the language won't improve anything.
Submitted by Anonymous on Fri, 10/30/2009 - 19:02
Re: And what's the point?
Hi Another,
If you find a problem with ROOT then please submit a bug report; it's the only way we can fix it. If you are referring to CINT (which from my point of view is currently one of the most limiting parts of ROOT) then you might be interested to hear that "we are working on it" :-)
Interfacing with C++ actually works pretty well, as long as you keep the API simple. But I partially agree - already the STL (especially with Microsoft's implementation, e.g. debug vs. optimized builds) can completely screw up an interface.
And check the slides I posted the link to here: there are several ways in which improvements in C++ can help ROOT and its users!
Cheers, Axel