Herb Sutter's Blog, page 2

March 30, 2025

Crate-training Tiamat, un-calling Cthulhu: Taming the UB monsters in C++

For more background on safety and security issues related to C++, including definitions of “language safety” and “software security” and similar terms, see my March 2024 essay “C++ safety, in context.” This essay picks up our story where that one left off to bring us up to date with a specific focus on undefined behavior (aka UB).

This is a status update on improvements currently in progress for hardening and securing our C++ software.

The C++ community broadly has a lot of hardening work well underway. Across the industry, this includes work being done by individual vendors, which they are then contributing to the standardization process so C++ programmers can use it portably. In the standard, it ranges from things we have had for a while (UB-free constexpr compile-time code) to things we’ve done recently (in draft C++26: erroneous behavior, bounds-hardened standard library, and contracts for functional safety) to proposals we’re actively pursuing next (in progress: Bjarne Stroustrup’s profiles, Úlfar Erlingsson’s remote code execution hardening).

A common underlying thread of all this work is that each piece addresses more and more of C++’s undefined behavior (aka UB), and especially the UB most exploited by attackers. We’re addressing UB methodically, starting with addressing the common high-value cases that will do the most to harden our code: uninitialized variables, out-of-bounds access, pointer misuse, and the key UB cases that adversaries need to implement remote code execution. These are the weaknesses that attackers exploit, and that we are locking down to lock them out.

Common (dis)belief: “UB is just too central to C++, trying to improve it enough to matter is hopeless”

[Image: Tiamat and Cthulhu in a cage, with a happy person in front making a thumbs-up sign]

For the sake of discussion, assume the cage is impervious to dragon breath and psionics. It’s just a metaphor.

Tech pundits still seem to commonly assume that UB is so fundamentally entangled in C++’s specification and programs that C++ will never be able to address enough UB to really matter. And it is true that it’s currently way too easy to accidentally let tendrils of silent UB slither pervasively throughout our C++ code.

Background in a nutshell: In C++, code that (usually accidentally) exercises UB is the primary root cause of our memory safety and security vulnerability issues. When a program contains UB, anything can happen; it’s common to call the whole thing “the UB dragon” and say “UB can reformat your hard drive or make demons fly out your nose” — hence the Tiamat and Cthulhu metaphors. Worse than those things, however, is that UB regularly leads to exploitable security vulnerabilities and other expensive-to-fix bugs. (For more details about UB, see the Appendix.)

So it’s valid to ask: Can and will C++ ever do enough about UB to make a major difference?

Summary and spoilers

In this post, I’m happy to report that serious taming of C++ UB is underway…

(1) Since C++11 in 2011, more and more C++ code has already become UB-free. Most people just didn’t notice.

Spoiler: All constexpr/consteval compile-time code is UB-free. As of C++26 almost the entire language and much of the standard library is available at compile time, and is UB-free when executed at compile time (but not when the code is executed at run time, hence the following additional work all of which is about run-time execution).

(2) Since March 2024, the draft C++26 standard has already removed key “low-hanging fruit” run-time UB cases that were the root cause of significant categories of security vulnerabilities.

Spoiler: In draft C++26, uninitialized local variables are no longer UB, and most common non-iterator bounds errors in the hardened standard library, such as for string and vector and string_view and span, will no longer be UB in a “hardened” implementation. (And C++26 also has language contracts for a different aspect of safety, namely functional safety for defensive programming to reduce bugs in general.)

(3) Now, we’re undertaking to add more tools and to systematically catalog and address run-time UB in the C++ language.

Spoiler: Addressing each case of UB statically where possible (at compile time), or with run-time checking where necessary. The primary tools: (a) C++26 erroneous behavior (EB); (b) Bjarne Stroustrup’s profiles and Gabriel Dos Reis’ profiles framework to opt into full safety by default and tactically opt out again where needed (sometimes you do want to breathe fire at a specific loop); and/or (c) applying C++26 contract assertions to check language features. EB and basic contract assertions are already part of C++26; profiles have work now underway focusing on implementation and deployment of the profiles framework and a few key profiles for experimentation across the C++ ecosystem. In addition, Úlfar Erlingsson is proposing a profile to surgically eliminate specifically the UB that attackers use to do remote code execution (RCE) which has the promise to eliminate many (and let developers opt into eliminating nearly all) malware exploits in recompiled C++ code.

If successful, these steps would achieve parity with the other modern memory-safe languages as measured by the number of security vulnerabilities, which would eliminate any safety-related reasons not to use C++. Note that leveling the playing field with other languages still means there are other security issues that need to be addressed too, in all languages, such as logic bugs for functional safety (C++26 contracts will help here); we’re first addressing the most valuable target to get to parity with other modern languages and then will continue to do more.

Importantly, this approach to hardening C++ doesn’t change C++’s value proposition — it keeps C++ still C++, it doesn’t try to turn C++ into “something else” such as by requiring mandatory performance overheads. All of the above embrace C++’s existing source code and its “zero-overhead, don’t pay for it if you don’t use it” core values, and just make it convenient to make memory safety the default — always with an opt-out, so that full performance and control is always available when you want to let Tiamat and Cthulhu use their powers in your service, under your control and for good.

And it’s designed to be super adoptable to bring existing code forward:

Many of the improvements are adoptable without any code changes (really!): just recompile your existing project with a C++26 compiler, and your code will be safer. This is important because when you write code you write bugs, and even when you write code to fix bugs you write new bugs; this is part of the cost of requiring code changes that we’d like to minimize.

Even when you opt into a profile language subset that rejects unsafe code by default, you can still opt back out to writing the unsafe thing with an explicit, greppable, and auditable “suppress safety rule here” annotation (similar to “unsafe” in other languages).

That’s it — if you stop reading here, you have the full story.

But I think the details are pretty interesting, so join me if you like as we dive further into the above points (1), (2), and (3)…

(1) Since 2011: constexpr code

Starting in C++11, C++’s compile-time constexpr world has already become a sandbox free from undefined behavior, quietly revolutionizing C++ by enabling powerful compile-time computation while also ensuring safety. During constexpr evaluation, the language mandates well-defined behavior — no wild pointers, no uninitialized reads, no surprises. If an operation might trigger undefined behavior, the compiler simply rejects the constexpr evaluation at compile time. This guarantees correctness before execution time, empowering developers to write faster, safer, and more expressive code.
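To make the compile-time guarantee concrete, here is a minimal sketch. The function name sum_first and its table are hypothetical, chosen only for illustration. The in-range call is accepted as a constant expression; the out-of-range call is left commented out because a conforming compiler is expected to reject it rather than silently read past the array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration: sums the first `count` entries
// of a small fixed table.
constexpr int sum_first(std::size_t count) {
    constexpr int table[4] = {1, 2, 3, 4};
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += table[i];  // out-of-bounds here would be UB at run time
    }
    return total;
}

// In range: evaluated entirely at compile time, so no UB can be involved.
static_assert(sum_first(4) == 10, "all four entries sum to 10");

// Out of range: uncommenting the next line is expected to fail to compile,
// because reading table[4] is UB and therefore not a constant expression.
// static_assert(sum_first(5) == 10, "would read past the end");
```

The same function called at run time with a count of 5 gets no such protection, which is why the run-time work described in the rest of this post is still needed.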

Every release of C++ has continued making more of the language and standard library available in compile-time constexpr code, so that as of C++26 nearly the entire language and much of the standard library is available in constexpr code.

This is modern C++ at its best: unleashing compile-time power while also enforcing its correctness.

This is in production use, not vaporware: All major compilers have supported UB-free constexpr compile-time code for over a decade and it’s in widespread production use. Probably almost every nontrivial C++ project today is already using at least some UB-free constexpr code, unless it is very old code compiled with a very old compiler.

(2) Since 2024: Language safety and software security improvements adopted for C++26

Over the past year, C++26 has made further solid progress on language safety and software security. Briefly, here’s what C++26 has already adopted (some of this material is repeated from my previous trip reports; see the links for much more detail and discussion):

In March 2024 (see my March 2024 trip report), draft C++26 eliminated UB for uninitialized variables by turning it instead into a new kind of behavior: erroneous behavior (aka EB) that is still considered “wrong code” (so compilers should still warn about it) but is now well-defined so it is no longer UB-dragon-bait even if your code does transgress. That eliminates one root cause of a serious class of security vulnerabilities.

Last month (see my February 2025 trip report), draft C++26 additionally added a specification for a hardened standard library. Just recompiling with a hardened library gives our programs bounds safety guarantees for many common non-iterator C++26 standard library operations, including common operations on very popular standard types: string, string_view, span, mdspan, vector, array, optional, expected, bitset, and valarray. (At the same meeting, we also adopted language contracts to help improve functional safety for defensive programming to reduce bugs in general.)

Importantly, both of these achieve the holy grail of adoptability: “Just recompile all your existing code with a C++26 compiler / hardened library, and it will be safer.” That’s just an awesome adoption story. If you’ve seen any of my recent talks, you know this is close to my heart… see especially this short clip from my November talk in Poland and also this short clip in the Q&A about the societal value of improving C++. Of course, getting full safety improvements will sometimes require code changes, nobody is saying otherwise — for example, if you write a dangling pointer because your code is confused about ownership then you really will need to go fix and possibly restructure your code. But it’s pretty nice that we can get a subset of the safety improvements even just by recompiling our existing code!

Again, this is in production use, not vaporware: The support for uninitialized variables and the hardened standard library may be new to draft standard C++26, but they are already well supported on existing compilers. For uninitialized variables, you can already use the pre-standard compiler switches -ftrivial-auto-var-init=pattern (GCC, Clang) and /RTC1 (MSVC). For the hardened standard library, as the P3471 authors note, it has already been deployed in major commercial environments (you can use it today in libc++, see documentation here; MS-STL and libstdc++ have some similar options):
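As a rough illustration of what a hardened library check amounts to, here is a minimal sketch. The names hardened_index and last_element are hypothetical and are not part of any standard library; a real hardened implementation performs the equivalent check inside operator[] itself. To my knowledge, libc++ enables such checks via the _LIBCPP_HARDENING_MODE macro and libstdc++ via _GLIBCXX_ASSERTIONS, but check your toolchain's documentation for the exact spelling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a hardened operator[]: check the precondition
// and stop the program instead of proceeding into undefined behavior.
template <class T>
const T& hardened_index(const std::vector<T>& v, std::size_t i) {
    if (i >= v.size()) {
        std::abort();  // assumption: a real implementation may trap or call a handler
    }
    return v[i];
}

// A classic off-by-one: for an empty vector, size() - 1 wraps around to a
// huge index. Unchecked, that is an out-of-bounds read; here it is caught.
int last_element(const std::vector<int>& v) {
    return hardened_index(v, v.size() - 1);
}
```

The point of the hardened library is that existing calls such as v[i] get this behavior after a recompile, with no wrapper function and no source changes.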

“We have experience deploying hardening on Apple platforms in several existing codebases.

Google recently published an article where they describe their experience with deploying this very technology to hundreds of millions of lines of code. They reported a performance impact as low as 0.3% and finding over 1000 bugs, including security-critical ones.

Google Andromeda published an article ~1 year ago about their successful experience enabling hardening.

The libc++ maintainers have received numerous informal reports of hardening being turned on and helping find bugs in codebases.

Overall, standard library hardening has been a huge success, in fact we never expected so much success. The reception has been overwhelmingly positive …”

This really demonstrates the value of addressing low-hanging fruit, and the Pareto principle (aka 80/20 rule): Often 80% of the benefit comes from the first 20% of investment.

(3) Since the past month: More work ongoing in the C++26 timeframe

For about a year now, multiple C++ committee experts have independently proposed systematically cataloging and/or addressing UB in C++:

December 2023: Shafik Yaghmour’s proposal P3075R0 to catalog C++’s language UB and document it as an Annex to the standard. (Building on his earlier pre-pandemic paper P1705R1.) This was encouraged by the core language specification subgroup (aka CWG) at the March 2024 meeting.

October 2024: My proposal P3436R0 to catalog UB and systematically address it using the opt-in mechanism of Bjarne Stroustrup and Gabriel Dos Reis’ language profiles proposal which has the ability to designate profiles as “named groups” of related compile-time restrictions and run-time checks that are easy to opt into to make safety the default. For more details, see my November 2024 trip report. This was unanimously encouraged by the Safety and Security subgroup (aka SG23) at the November 2024 meeting.

October 2024: Timur Doumler, Gašper Ažman, and Joshua Berne’s proposal P3100R1 to catalog UB and systematically address it as contract violations, using the new C++26 contract_assert feature to perform run-time checks also for problematic language features. There is a related proposal P3400 to designate contract labels as “named groups” of related run-time checks that are easy to opt into to make safety the default. P3100 was unanimously encouraged by the Contracts subgroup (aka SG21) at the November 2024 meeting.

You can see the pattern: there are proposers and volunteers to

systematically catalog language UB,

specify a way to eliminate the UB (make it illegal, or well-defined including where necessary with a run-time check such as a bounds check),

make that elimination happen preferably all the time where it’s efficient enough (as C++26 is doing for uninitialized local variables) or else under a named group that’s easy to opt into (profile name, or contract label name), and

recognize that different UB cases need to be addressed in different ways, and we’re willing to put in the effort… no magic wand, Just Engineering.

At our February 2025 meeting, the main subgroup responsible for all language evolution (aka EWG) took these suggestions and gathered them together, and the group approved a mandate to pursue

“… a language safety white paper in the C++26 timeframe containing
systematic treatment of core language Undefined Behavior in C++,
covering Erroneous Behavior, Profiles, and Contracts.”

Note that this is separate from C++26, because C++26 is now undergoing feature freeze and will spend the next year doing comment review and fit-and-finish, so we cannot now add new material (such as UB mitigations) to C++26 itself. But we want to keep our momentum and not let this important work wait for C++29, so concurrently with C++26 “in the C++26 timeframe” we intend to work on a white paper to catalog and address C++ language UB, that we hope to publish around the same time as C++26 is published.

Note: A white paper is an ISO publication that’s a flavor of Technical Specification (TS); think of a white paper or TS as a “feature branch.” The C++ committee has already published a dozen TSes since 2012, such as the concepts and modules TSes, most of which have already been merged into the “trunk” international standard (aka IS). A white paper and TS use the same process within the C++ committee, but a white paper just has less ISO red tape at the end compared to a TS so it can be published faster.

So now and over the next year or two, we’re undertaking to systematically catalog cases of UB in the C++ language to put a visible label on each fang and tentacle. Then, starting with the most important high-value targets, start deciding whether and how to address each in the most appropriate way but likely using those three tools mentioned in the mandate:

C++26 erroneous behavior, which you’ll recall the draft C++26 standard is already using to deal with uninitialized local variables.

Bjarne Stroustrup’s profiles and Gabriel Dos Reis’ P3589 profiles framework which allow us to create named groups of rules and checks, so that program code can easily opt into full safety by default and tactically opt out again where needed. Efforts now underway are focusing on implementation and deployment of the profiles framework and a few key profiles for experimentation across the C++ ecosystem.

C++26 contract assertions to check language features, as extended with P3400 labels which allow us to create named groups of checks.

I won’t lie: This is going to be a metric ton of work. And it’s work that I think some people don’t expect C++ to ever be able to do. But I think that it is achievable, and that it will be worth it, and we appreciate and want to thank all the committee members who have already expressed interest in volunteering to help — thank you!

New a week ago: P3656 strongly encouraged

Gašper Ažman and I got appointed to try to organize the work. So to get this started, Gašper and I wrote paper P3656 to detail a proposed procedure and plan. On March 19, EWG reviewed this in a telecon and voted strong encouragement that

“P3656 is ‘on the right track’ with the strategy proposed for
producing a white-paper for ‘Core Language UB (and IF-NDR).’”

So here’s a quick overview of what we aim to do over the coming year or two, in the same timeframe as C++26…

First, list cases: Enumerate language UB

The goal of this part is to tag every case of language UB directly in the standard’s LaTeX sources, with at least a short description and code example. Using LaTeX tags right in the standard’s sources will let us automatically build another Annex to list the UB in one place, as the standard already does for the grammar for example. Additional detailed discussion and selected mitigations will go into the white paper.

We will also likely tag some basic attributes of each UB, such as:

have security experts tag whether it is directly exploitable, so that we can prioritize security-critical low-hanging fruit first; and

tag whether it is cheap to check locally with information already available (such as null pointer dereference which is easy to check locally with ptr != nullptr) or requires more information (such as other-than-null dangling pointer dereference which is more challenging, and some UB may be too expensive to entirely remove).
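The "cheap to check locally" distinction can be sketched as follows. The function read_if_valid is a hypothetical name used only for illustration. The null test needs nothing beyond the pointer value itself, whereas nothing in the pointer value reveals whether a non-null pointer is dangling.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: a null check uses only information available at the
// point of use, so it is cheap to perform locally.
std::optional<int> read_if_valid(const int* ptr) {
    if (ptr != nullptr) {
        return *ptr;
    }
    return std::nullopt;
}

// By contrast, a non-null dangling pointer passes the test above.
// Detecting it needs extra information (ownership, lifetime tracking, or
// instrumentation), which is why such cases are harder to remove.
```

A caller would then handle the empty result explicitly instead of running into a null dereference.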

This also creates backpressure to reduce adding future UB, by requiring discussion and documentation in this list for any new UB proposals.

Second, list tools: Create a “non-exhaustive starter menu of tools”

The idea is to make an initial list of the tool(s) we can apply to each case of UB.

The EWG mandate already included erroneous behavior (EB), profiles, and contracts as the primary expected tools, so a slightly more detailed candidate list might be:

make the UB well-defined (just fix it always, no opt-in required; this could be a run-time check);

make the UB fail to compile (e.g., make it ill-formed which could change the meaning of SFINAE code that could use a different fallback path to avoid the UB path, or make it directly rejected without changing any meaning), either always or when a profile/label is enforced;

make the UB deprecated, either always or when a profile/label is enforced; and/or

make the UB be EB instead, either always (as we did for uninitialized locals) or when a profile/label is enforced.

This list is not exhaustive; we may find UB we want to handle using another technique, but I expect most cases of UB can be handled well using these tools.

We also intend to write some initial guidelines, for EWG to review and approve, about when to use each tool, including performance considerations, adoption hurdles (like frequency of that UB, or consequences of crashes), and other common considerations.

Third, apply: For each case of UB, say how we plan to address it

In many cases, this will require thoughtful papers, including strong implementation experience when there is a risk that performance or deployability may be difficult. My expectation is that we will find groups of similar UB that can all be handled in one paper, but the point is we want to be methodical about this… we aim to move fast, but the primary goal here is to make sure we actually unbreak things.

Fourth, group: Group UB cases into cohesive groups (profiles names / contract labels)

Finally, we can identify cohesive groups of UB that programs will want to address together, which makes them easy to opt into as a unit; for example, a “bounds_safety” group could include all bounds safety-related UB. These groups can overlap; for example, the same UB fix might be selectable as part of a “bounds_safety” group and as part of a general larger “strict_cplusplus” group.

New a few days ago: Efforts in progress to lock down the specific UB that malicious code relies on

Relatedly, a very interesting proposal was brought to the February ISO C++ meeting by Úlfar Erlingsson, Google’s DE for Cloud Security, P3627R0 (slides): “Easy-to-adopt security profiles for preventing RCE (remote code execution) in existing C++ code.”

Summarizing Úlfar’s premise:

We have already developed sufficient hardening implementation technology in modern compilers to effectively harden existing C++ code without code changes, not by aiming for language memory safety guarantees broadly, but by surgically targeting key UB that makes remote code execution (RCE) possible. Specifically: Stack integrity, control-flow integrity (CFI), heap data integrity, and pointer integrity and unforgeability. (Note: Úlfar was the first to efficiently implement stack integrity with strong guarantees, working with George Necula who originally designed it in CCured; and he and collaborators were the first to propose and implement CFI.)

If we do nothing more than take away the UB that can be used as building blocks for RCE (even if we still allowed other corruption), then bad actors would lose most of the tools they use to gain control over execution and run their malware, and we would dramatically harden the world’s code.

A key problem is that right now these technologies exist as separate features when the real benefit comes from enabling them together, and so we should standardize a profile that lets programmers tell their compilers to activate them together.

On Thursday, Úlfar published a new paper elaborating these ideas: “How to Secure Existing C and C++ Software without Memory Safety” describes how these techniques could not only prevent most RCE but also generally retake control of execution away from the attackers.

It’s well worth reading. An updated paper proposing this material for C++ standardization is expected soon in the C++ committee. As Úlfar notes (emphasis added): “This is a big change and will require a team effort: Researchers and standards bodies need to work together to define a set of protection profiles that can be applied to secure existing software — without new risks or difficulties — easily, at the flip of a flag …”

Note: A related new publication updated a week ago is the OpenSSF “Compiler Options Hardening Guide for C and C++.” This is a useful guide to existing security options that are good to know about and can be used in today’s compilers. These options add a variety of warnings and mechanisms that will help with security, including some used in Úlfar’s proposal (CFI and address space layout randomization, aka ASLR). However, these options are all “best effort,” and do not promise any guarantees, even when used all together — including options needing source code changes and those with noticeable overhead. What makes Úlfar’s approach different is that it carefully selects four specific techniques designed to reinforce each other such that they establish guarantees about the nested execution of functions, and the use of heap objects and pointers. Those guarantees eliminate almost all of the specific UB that malware authors rely on, and will hold even when the remaining UB is triggered, e.g., to corrupt memory.

If the language UB white paper could achieve not only its first goal of a broad systematic cataloging and mitigation of UB (grouped into profile/label names that programmers can turn on), but also specifically a “controlled_execution_security” profile that eliminates nearly all remote code execution attacks, that would be a great outcome — and would dramatically reduce C++ software security vulnerability exposure to parity (equality) with the other modern languages.

Summary, and what’s next

As a wise sage once said: “If you choose not to decide, you still have made a choice.”

For many years, software security may not have seemed pressing enough for C++ standardization broadly to make it a top priority, though gradual improvement has always continually occurred. But times have changed; we have been confronted with a spike of cyberattacks and cyberwar that creates serious threats to the systems we rely on to sustain our civilization, and faced stark choices: react decisively? and how? or not? Making a choice was not optional, as the sage pointed out.

We have chosen: to focus on improving C++ language safety as a priority, with the goal of achieving parity (as measured in number of security vulnerabilities) with other modern languages.

We have already accomplished a great deal. Compile-time C++ is already fully free of UB, which means a huge chunk of real-world C++ is already UB-free today. In C++26 we’re already eliminating several frequent vulnerability UB root causes, where in the language uninitialized variables are no longer UB and in the standard library many common operations on widely used types like vector and string and span and string_view are becoming bounds-safe in a C++26 hardened implementation. Although these are new to the standard, all have been deployed at scale in the field, and making them standard will make them easier to adopt even more widely. (In C++26, we are also shipping language contracts for a different aspect of safety, namely functional safety for defensive programming to reduce bugs in general.)

It’s working: The price of zero-day exploits has already increased. Now we have a path to get the rest of the way to taming UB in C++. Yes, there’s still a great deal of work ahead, but if we can make a solid push over the next one to two years we do have a real shot at systematically addressing UB in C++, including eliminating nearly all remote code execution attacks. If these efforts to cage the monsters work out even half as well as we hope, I think a lot of folks are going to be very (and I think happily) surprised.

As several other wise sages said: “Let the good times roll.”

If you’re one of the ones helping with either what’s been accomplished already and/or with our next steps above, we want to again say a big “thank you!” — your help is appreciated, and it really matters.

Thanks, very much.

Appendix: UB, briefly

Historically, UB was allowed in C and then C++ as the basis for compiler optimizations: Compilers are allowed to assume that UB never happens and optimize your program based on that assumption. In the real world, compilers are variously aggressive about making that assumption; for a survey of what common examples the different major compilers actually do optimize in what ways at different optimization levels, see my 2020 paper P2064R0 section 3.4.

We have now been reconsidering UB for several reasons, which to me correspond to the UB dragon having multiple heads:

UB often has direct safety and security implications. For example, if a program sometimes tries to access out-of-bounds memory, a malicious actor can use that vulnerability to write an exploit that will install malware to steal cryptocurrency or worse.

UB also has indirect safety and security implications. For example, if the compiler encounters an if/else branch and notices that one side of the branch would always encounter UB, it can not only assume that branch is never taken, but it can also assume that the condition the branch is testing is always true (or always false) and so not even test it, which is problematic if the branch was doing a deliberately-enabled safety-related contract check that the compiler ends up silently optimizing out of the compiled program so that the check is never performed at all.

UB optimizations also just create mysterious ordinary bugs, such as variables that appear to be simultaneously true and false, unreachable code that gets executed anyway, and “time travel” optimizations that change code that precedes the point where the UB can happen (hence, the idea of UB ‘reaching back to modify the past’).
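A minimal sketch of how a safety check can vanish: the two functions below are hypothetical names for illustration. In the risky version the dereference comes first, so an optimizer is permitted to reason that the pointer cannot be null (otherwise UB would already have occurred) and delete the later test. Whether a given compiler actually does so depends on the compiler and optimization level.

```cpp
#include <cassert>

// Risky ordering: *p is evaluated before the null test. Because
// dereferencing null is UB, the compiler may assume p != nullptr and
// remove the check below entirely.
int first_or_default_risky(const int* p) {
    int value = *p;
    if (p == nullptr) {
        return -1;  // may be optimized away
    }
    return value;
}

// Safe ordering: the test happens before any dereference, so no UB
// precedes it and the check cannot legally be dropped.
int first_or_default(const int* p) {
    if (p == nullptr) {
        return -1;
    }
    return *p;
}
```

Only the second function is safe to call with a null pointer; the first is shown purely to illustrate the pattern.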

Less of all those things, please. Over the past decade C++ has been pursuing ways to keep all our glorious optimizations but to specify the optimizability in ways other than fire-breathing mind-flaying UB.

Notes:

Addressing UB in C++ is easier to do than in C, because C is a fine language but is lower-level with fewer standard abstractions, which means it has fewer universally available alternatives to recommend and fewer standard library features that the standard can directly harden.

UB is closely related to another technical concept in the C++ standard called “ill-formed, no diagnostic required” (IF-NDR). For convenience, herein I’m saying just “UB” as a shorthand for “UB and [or, including] IF-NDR.”


Published on March 30, 2025 23:39

February 17, 2025

Trip report: February 2025 ISO C++ standards meeting (Hagenberg, Austria)

On Saturday, the ISO C++ committee completed the second-last design meeting of C++26, held in Hagenberg, Austria. There is just one meeting left before the C++26 feature set is finalized in June 2025 and draft C++26 is sent out for its international comment ballot (aka “Committee Draft” or “CD”), and C++26 is on track to be technically finalized two more meetings after that in early 2026.

This meeting was hosted by the University of Applied Sciences of Upper Austria, RISC Software GmbH, Softwarepark Hagenberg Upper Austria, Dynatrace, and Count It Group. Our hosts arranged for high-quality facilities at the University for our six-day meeting from Monday through Saturday. We had over 200 attendees, about two-thirds in-person and the others remote via Zoom, formally representing 31 nations. At each meeting we regularly have new guest attendees who have never attended before, and this time there were 36 new first-time guest attendees, mostly in-person, in addition to new attendees who are official national body representatives. To all of them, once again welcome!

The committee currently has 23 active subgroups, 13 of which met in 7 parallel tracks throughout the week. Some groups ran all week, and others ran for a few days or a part of a day, depending on their workloads. We also had combined informational evening sessions to inform the committee broadly about progress on two key topics this week: reflection wording review, and concurrent queues. You can find a brief summary of ISO procedures here.

Highlights

This time, the committee adopted the next set of features for C++26, and made significant progress on other features that are now expected to be complete in time for C++26.

In addition to features already approved for C++26 at previous meetings, at this meeting three major features made strong progress. In the core language:

P2900 Contracts was adopted for C++26

P2786 Trivial Relocatability was adopted for C++26

P1967 #embed was adopted for C++26

In the standard library:

P3471 Standard Library Hardening (which is also the first use of contracts) was adopted for C++26

P0447 std::hive was adopted for C++26

Other noteworthy progress:

P2996 Reflection is almost done its specification wording review aiming for C++26, and is expected to come up for vote for inclusion in C++26 at the June meeting

Language safety and software security improvements adopted for C++26

C++26 adopted two major features that improve language/library safety and software security: contracts, and a security-hardened standard library that has already delivered actual important security improvements just by recompiling/relinking existing C++ code (see references below).

Note: For definitions of “language safety” and “software security” and similar terms, see my 2024 essay “C++ safety, in context.”

C++26 Contracts

First, years in the making, we adopted P2900R14 “Contracts for C++” by Joshua Berne, Timur Doumler, and Andrzej Krzemieński, with Gašper Ažman, Peter Bindels, Louis Dionne, Tom Honermann, Lori Hughes, John Lakos, Lisa Lippincott, Jens Maurer, Ryan McDougall, Jason Merrill, Oliver J. Rosten, Iain Sandoe, and Ville Voutilainen. This is a huge paper (119 pages) that adds preconditions, postconditions, and contract_assert (a major improvement that brings C’s “assert” macro into the language, with improvements). For an overview, see Timur Doumler’s blog post “Contracts for C++ explained in 5 minutes.” The main change last week is that the committee decided to postpone supporting contracts on virtual functions; work will continue on that and other extensions. Thanks to the coauthors, and to everyone in Study Group 21 (SG21) and the language evolution working group (EWG) and everyone who commented and gave feedback, for their hard work on this feature for many years!

Note: This is the second time contracts has been voted into draft standard C++. It was briefly part of draft C++20, but was then removed for further work.

Relatedly, P1494R4 “Partial program correctness” by Davis Herring adds the idea of “observable checkpoints” that limit the ability of undefined behavior to perform time-travel optimizations. This helps to eliminate some optimization pitfalls that range from not actually executing an enforced contract (see contracts, above) to security vulnerabilities. The paper also provides std::observable() as a manual way of adding such a checkpoint in code.

C++26 hardened standard library

The second is another big step for language and library safety in C++26. Recall that:

C++23 already eliminated returning a dangling reference to a local object at compile time (did you know that? try it now on Godbolt), which has already removed one common source of silent dangling.
C++26 has already eliminated undefined behavior from uninitialized variables.
And now in C++26…

… P3471R4 “Standard library hardening” by Konstantin Varlamov and Louis Dionne provides initial portable, cross-platform security guarantees for the C++ standard library too as part of C++26. It turns some common and frequently-exploited instances of undefined behavior in the C++ standard library into a contract violation (note: that makes it also the first user of the just-adopted contracts language feature, above! yes, WG21 does try to coordinate its delivered features).

In particular, only two (2) programming language weaknesses made the top 15 most dangerous software weaknesses in the MITRE 2024 CWE Top 25 Most Dangerous Software Weaknesses — and those two are Out-of-Bounds Write (#2) and Out-of-Bounds Read (#6).

This is why I keep repeating that, yes, we need to improve C (especially) and C++ memory safety, but that is far from the only thing we as an industry need to do. As we harden one area, attackers just shift to the next slowest animal in the herd. Already, above, we are seeing more and more of the MITRE Top 25 be not programming language memory safety issues… in 2024, it’s down to just 2 of the top 15. Sure we need to fix those 2 issues and others like them, but let’s never forget that we need to fix the other 13 of the top 15 too! For more on this, again please see my “C++ safety, in context” essay.

This continues to demonstrate what I’ve explained many times: Bounds safety is the lowest-hanging fruit in terms of things we need to address first. That’s why it’s a big deal that, as of Saturday, the C++26 hardened standard library focuses on bounds safety and requires that all of the following are guaranteed to be bounds-checked:

In std::span: operator[], front, back, first, last, subspan, and constructors
In the std::string_views: operator[], front, back, remove_prefix, remove_suffix
In all sequence containers (e.g., std::vector, std::array): operator[], front, back, pop_front, pop_back
In the std::strings: operator[], front, back, pop_back
In the multidimensional std::mdspan: operator[], and constructors
In std::bitset: operator[]
In std::valarray: operator[]
In std::optional: operator->, operator*
In std::expected: operator->, operator*, error

And that’s just a start, that has already made a real impact on hardening production code including on popular platforms.
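To make "guaranteed to be bounds-checked" concrete, here is a minimal sketch of the idea. It is not how any standard library implements hardening: the names checked_view, bounds_violation, and access_is_rejected are hypothetical, and throwing an exception is an assumption made only so the failed check is easy to observe in a small example. A hardened standard library treats the failed check as a contract violation instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical illustration type; not part of the standard library.
struct bounds_violation : std::out_of_range {
    using std::out_of_range::out_of_range;
};

// Hypothetical non-owning view whose accessors always check their preconditions.
template <typename T>
class checked_view {
    T* data_;
    std::size_t size_;

public:
    checked_view(T* data, std::size_t size) : data_(data), size_(size) {
        assert(data != nullptr || size == 0);
    }

    std::size_t size() const { return size_; }

    // The precondition "i < size()" is checked on every access instead of
    // being left as undefined behavior when it is violated.
    T& operator[](std::size_t i) const {
        if (i >= size_) throw bounds_violation("checked_view: index out of range");
        return data_[i];
    }

    T& front() const { return (*this)[0]; }

    T& back() const {
        if (size_ == 0) throw bounds_violation("checked_view: back() on empty view");
        return data_[size_ - 1];
    }
};

// Returns true if reading index i of v was rejected by the bounds check.
inline bool access_is_rejected(const std::vector<int>& v, std::size_t i) {
    checked_view<const int> view(v.data(), v.size());
    try {
        (void)view[i];
        return false;
    } catch (const bounds_violation&) {
        return true;
    }
}
```

The point of the sketch is only that the check lives inside the accessor, so calling code does not change; that is the same property that lets existing code benefit from a hardened library by rebuilding.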

Importantly, user code gets this benefit just by building with a hardened C++26 standard library — without any code changes. If you’ve seen any of my recent talks, you know this is close to my heart… see especially

this short clip in my code::dive 2024 talk about why C++26 removing undefined behavior for uninitialized locals is a model for adoptability and also this short clip in the Q&A about the societal value of improving C++.

I think that Konstantin and Louis express that value proposition beautifully in their paper’s Motivation section, and I’ll quote most of their appeal here (emphasis original)… they “get it”:

“There has been significantly increased attention to safety and security in C++ over the last few years, as exemplified by the well-known White House report and numerous recent security-related proposals.

“While it is important to explore ways to make new code safer, we believe that the highest priority to deliver immediate real-world value should be to make existing code safer with minimal or no effort on behalf of users. Indeed, the amount of existing security-critical C++ code is so large that rewriting it or modifying it is both economically unviable and dangerous given the risk of introducing new issues.

“There have been a few proposals accepted recently that eliminate some cases of undefined behavior in the core language. The standard library also contains many instances of undefined behavior, some of which is a direct source of security vulnerabilities; addressing those is often trivial, can be done with low overhead and almost no work on behalf of users.

“In fact, at the moment all three major library implementations have some notion of a hardened or debug mode. This clearly shows interest, both from users and from implementers, in having a safer mode for the standard library. However, we believe these efforts would be vastly more useful if they were standardized and provided portable, cross-platform guarantees to users; as it stands, implementations differ in levels of coverage, performance guarantees and ways to enable the safer mode.

“Finally, leaving security of the library to be a pure vendor extension fails to position ISO C++ as providing a credible solution for code bases with formal security requirements. We believe that formally requiring the basic safety guarantees that most implementations already provide in one way or another could make a significant difference from the point of view of anyone writing or following safety and security coding standards and guidelines.”

In the next section, they demonstrate that this isn’t some theoretical improvement — it’s an improvement that is standardizing what is already shipping and significantly hardening existing C++ code today (emphasis added):

“All three major implementations provide vendor-specific ways of enabling library assertions as proposed in this paper, today.

“We have experience deploying hardening on Apple platforms in several existing codebases. Google recently published an article where they describe their experience with deploying this very technology to hundreds of millions of lines of code. They reported a performance impact as low as 0.3% and finding over 1000 bugs, including security-critical ones. Google Andromeda published an article ~1 year ago about their successful experience enabling hardening. The libc++ maintainers have received numerous informal reports of hardening being turned on and helping find bugs in codebases.

“Overall, standard library hardening has been a huge success, in fact we never expected so much success. The reception has been overwhelmingly positive and while the quality of implementation will never be perfect, we are working hard to expand the scope of hardening in libc++, to improve its performance and the user experience.”

This further demonstrates that not only is C++ making serious progress to improve, but that many of the language safety and software security improvements are already shipping without waiting for standardization. Standardization is still important, of course, because it makes these improvements available portably, with portable guarantees for C++ code on all platforms.

More things adopted for C++26: Core language changes/features

Note: These links are to the most recent public version of each paper. If a paper was tweaked at the meeting before being approved, the link tracks and will automatically find the updated version as soon as it’s uploaded to the public site.

In addition to fixing a list of defect reports, the core language adopted 8 papers, including contracts (above) and the following…

P2786R13 “Trivial relocatability for C++26” by Alisdair Meredith, Mungo Gill, Joshua Berne, Corentin Jabot, Pablo Halpern, and Lori Hughes adds stronger support for optimizing the copying of memcpy-able types in the C++ language. This removes a source of “undefined behavior” that many container libraries rely on because it happens to be useful and probably-benign, and not only guarantees it is well-defined but also makes the optimizations more widely available for more types. Thank you, Alisdair and your collaborators, and thanks also to the authors of other trivial-relocation proposals that were not adopted! All of the input has made the result better, and we appreciate all the continued feedback.

P1967R14 “#embed – a scannable, tooling-friendly binary resource inclusion mechanism” by JeanHeyd Meneide enables “#include for binary data” — a portable way to pull binary data into a program without external tools and build system support. The introduction is clear and crisp:

“For well over 40 years, people have been trying to plant data into executables for varying reasons. Whether it is to provide a base image with which to flash hardware in a hard reset, icons that get packaged with an application, or scripts that are intrinsically tied to the program at compilation time, there has always been a strong need to couple and ship binary data with an application.

“Neither C nor C++ makes this easy for users to do, resulting in many individuals reaching for utilities such as xxd, writing python scripts, or engaging in highly platform-specific linker calls to set up extern variables pointing at their data. Each of these approaches come with benefits and drawbacks. For example, while working with the linker directly allows injection of very large amounts of data (5 MB and upwards), it does not allow accessing that data at any other point except runtime. Conversely, doing all of these things portably across systems and additionally maintaining the dependencies of all these resources and files in build systems both like and unlike make is a tedious task.

“Thusly, we propose a new preprocessor directive whose sole purpose is to be #include, but for binary data: #embed.”

Note that this feature has already been approved for inclusion in the next revision of C as well. See the proposal paper, especially sections 3.3 and 4.1, for more delightful background and design alternative discussion.

P2841R7 “Concept and variable-template template-parameters” by Corentin Jabot, Gašper Ažman, James Touton, and Hubert Tong adds the ability of passing concepts and variable templates as template parameters. Thank you, Corentin and your coauthors!

More things adopted for C++26: Standard library changes/features

In addition to fixing a list of defect reports, the standard library adopted 15 papers, including the following…

P0447R28 “Introduction of std::hive to the standard library” by Matthew Bentley is a major library addition that formalizes a widely-used high-performance data structure. From the paper’s overview:

“Hive is a formalisation, extension and optimization of what is typically known as a ‘bucket array’ or ‘object pool’ container in game programming circles. Thanks to all the people who’ve come forward in support of the paper over the years, I know that similar structures exist in various incarnations across many fields including high-performance computing, high performance trading, 3D simulation, physics simulation, robotics, server/client application and particle simulation fields (see this google groups discussion, the hive supporting paper #1 and appendix links to prior art).

“The concept of a bucket array is: you have multiple memory blocks of elements, and a boolean token for each element which denotes whether or not that element is ‘active’ or ‘erased’ – commonly known as a skipfield. If it is ‘erased’, it is skipped over during iteration. When all elements in a block are erased, the block is removed, so that iteration does not lose performance by having to skip empty blocks. If an insertion occurs when all the blocks are full, a new memory block is allocated.

“The advantages of this structure are as follows: because a skipfield is used, no reallocation of elements is necessary upon erasure. Because the structure uses multiple memory blocks, insertions to a full container also do not trigger reallocations. This means that element memory locations stay stable and iterators stay valid regardless of erasure/insertion. This is highly desirable, for example, in game programming because there are usually multiple elements in different containers which need to reference each other during gameplay, and elements are being inserted or erased in real time. The only non-associative standard library container which also has this feature is std::list, but it is undesirable for performance and memory-usage reasons. This does not stop it being used in many open-source projects due to this feature and its splice operations.

“Problematic aspects of a typical bucket array are that they tend to have a fixed memory block size, tend to not re-use memory locations from erased elements, and utilize a boolean skipfield. The fixed block size (as opposed to block sizes with a growth factor) and lack of erased-element re-use leads to far more allocations/deallocations than is necessary, and creates memory waste when memory blocks have many erased elements but are not entirely empty. Given that allocation is a costly operation in most operating systems, this becomes important in performance-critical environments. The boolean skipfield makes iteration time complexity at worst O(n) in capacity(), as there is no way of knowing ahead of time how many erased elements occur between any two non-erased elements. This can create variable latency during iteration. It also requires branching code for each skipfield node, which may cause performance issues on processors with deep pipelines and poor branch-prediction failure performance.

“A hive uses a non-boolean method for skipping erased elements, which allows for more-predictable iteration performance than a bucket array and O(1) iteration time complexity; the latter of which means it meets the C++ standard requirements for iterators, which a boolean method doesn’t. It has an (optional – on by default) growth factor for memory blocks and reuses erased element locations upon insertion, which leads to fewer allocations/reallocations. Because it reuses erased element memory space, the exact location of insertion is undefined. Insertion is therefore considered unordered, but the container is sortable. Lastly, because there is no way of predicting in advance where erasures (‘skips’) may occur between non-erased elements, an O(1) time complexity [ ] operator is not possible and thereby the container is bidirectional but not random-access.”
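The baseline bucket array that the paper starts from can be sketched in a few dozen lines. To be clear, this is not std::hive: it is the simple boolean-skipfield structure the quote describes, with a fixed block size, no reuse of erased slots, and no removal of empty blocks. The names bucket_array, handle, and sum_active are hypothetical, and T is assumed to be default-constructible and copy-assignable to keep the code short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, simplified bucket array with a boolean skipfield.
template <typename T, std::size_t BlockSize = 4>
class bucket_array {
public:
    // Identifies an element by block index and slot within that block.
    struct handle { std::size_t block; std::size_t slot; };

private:
    struct block {
        std::array<T, BlockSize> elems{};
        std::array<bool, BlockSize> active{};  // the boolean skipfield
        std::size_t used = 0;                  // slots handed out so far
    };
    // Each block is allocated separately, so growing the vector of block
    // pointers never moves the elements themselves.
    std::vector<std::unique_ptr<block>> blocks_;

public:
    // Appends a value, allocating a new block only when the last one is full.
    handle insert(const T& value) {
        if (blocks_.empty() || blocks_.back()->used == BlockSize)
            blocks_.push_back(std::make_unique<block>());
        block& b = *blocks_.back();
        const std::size_t slot = b.used++;
        b.elems[slot] = value;
        b.active[slot] = true;
        return {blocks_.size() - 1, slot};
    }

    T& at(handle h) {
        block& b = *blocks_[h.block];
        assert(b.active[h.slot]);
        return b.elems[h.slot];
    }

    // Marks the element as erased; no other element is moved.
    void erase(handle h) { blocks_[h.block]->active[h.slot] = false; }

    // Visits active elements in storage order, skipping erased ones.
    template <typename F>
    void for_each(F f) const {
        for (const auto& b : blocks_)
            for (std::size_t i = 0; i < b->used; ++i)
                if (b->active[i]) f(b->elems[i]);
    }
};

// Sums the active elements; erased slots do not contribute.
inline int sum_active(const bucket_array<int>& c) {
    int total = 0;
    c.for_each([&](int x) { total += x; });
    return total;
}
```

Even this naive version shows the property the paper cares about: erasing one element and inserting many more leaves the address of every other element unchanged. The costs the paper lists (per-slot branching during iteration, wasted erased slots) are visible in for_each and insert, and those are the parts std::hive is designed to improve.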

Continuing the “making more things constexpr (and consteval)” drumbeat that allows more and more of the full C++ language and standard library be usable in constexpr code, we approved a set of constexpr extensions:

P3372R3 “constexpr containers and adaptors” by Hana Dusíková makes all containers and adapters constexpr (except for the new std::hive, above).
P3378R2 “constexpr exception types” by Hana Dusíková makes all exception types used with constexpr code be constexpr too.

Thanks, Hana! These are just the latest of a continued stream of Hana’s papers for constexpr-ing the C++ world; much appreciated — děkuju and arigato!

You may recall that at our last meeting we merged std::simd by Matthias Kretz for high-throughput parallel/vector programming into draft C++26. On Saturday we approved a set of further extensions and refinements, including:

P3441R2 “Rename simd_split to simd_chunk” by Daniel Towner and Ruslan Arutyunyan not only does what it says on the tin, but also adds new convenience overloads.
P2663R7 “Interleaved complex values support in std::simd” by Daniel Towner and Ruslan Arutyunyan adds interleaved complex values support.
P2933R4 “Extend header function with overloads for std::simd” by (you guessed it!) Daniel Towner and Ruslan Arutyunyan.

Not to be outdone, P2976R1 “Freestanding library: algorithm, numeric, and random” by Ben Craig continues Ben’s march toward making a huge amount of C++ available on freestanding implementations. Thanks, Ben!

Last but not least, P3019R14 “indirect and polymorphic: Vocabulary types for composite class design” by Jonathan Coe, Antony Peacock, and Sean Parent adds value-semantic types for polymorphic objects to the standard library. This makes polymorphic types much easier to treat as values in value-like algorithms and use cases. Thanks very much, Jonathan and Antony and Sean!
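For readers who have not met the idea, "value-semantic types for polymorphic objects" means that copying the wrapper copies the object it owns, whatever its dynamic type. Here is a rough sketch of that behavior. It is not the P3019 design or API: poly_value and the Shape hierarchy are hypothetical names, and the sketch ignores allocators and most of the interface.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical sketch of a value-semantic owner of a polymorphic object.
// A moved-from poly_value is empty and must not be copied or dereferenced.
template <typename Base>
class poly_value {
    std::unique_ptr<Base> ptr_;
    // Remembers how to copy the dynamic type that was stored.
    std::unique_ptr<Base> (*clone_)(const Base&) = nullptr;

public:
    template <typename Derived>
    explicit poly_value(Derived d)
        : ptr_(std::make_unique<Derived>(std::move(d))),
          clone_([](const Base& b) -> std::unique_ptr<Base> {
              return std::make_unique<Derived>(static_cast<const Derived&>(b));
          }) {}

    // Copying copies the owned object, not just the pointer.
    poly_value(const poly_value& other)
        : ptr_(other.clone_(*other.ptr_)), clone_(other.clone_) {}

    poly_value& operator=(const poly_value& other) {
        poly_value tmp(other);
        ptr_ = std::move(tmp.ptr_);
        clone_ = tmp.clone_;
        return *this;
    }

    poly_value(poly_value&&) noexcept = default;
    poly_value& operator=(poly_value&&) noexcept = default;

    Base& operator*() { assert(ptr_); return *ptr_; }
    const Base& operator*() const { assert(ptr_); return *ptr_; }
    Base* operator->() { assert(ptr_); return ptr_.get(); }
    const Base* operator->() const { assert(ptr_); return ptr_.get(); }
};

// Hypothetical class hierarchy used only to exercise the wrapper.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

struct Polygon : Shape {
    int n;
    explicit Polygon(int n_) : n(n_) {}
    int sides() const override { return n; }
};
```

With something like this, a class that holds a poly_value<Shape> member gets correct copy behavior from its compiler-generated copy operations, which is the "composite class design" use case in the paper's title.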

What’s next

Thank you to all the experts who worked all week in all the subgroups to achieve so much this week!

Our next meeting will be this June in Sofia, Bulgaria hosted by Chaos Group and C++ Alliance.

Thank you again to the over 200 experts who attended on-site and on-line at this week’s meeting, and the many more who participate in standardization through their national bodies!

But we’re not slowing down… in case you think C++“26” sounds very far away, it sure isn’t… we’re only one meeting away before the C++26 freeze in June, and even before that compilers are already aggressively implementing C++26, with GCC and Clang having already implemented about two-thirds of C++26’s language features adopted so far! C++ is a living language and moving fast. Thank you again to everyone reading this for your interest and support for C++ and its standardization.

Published on February 17, 2025 23:01

January 16, 2025

code::dive 2024 interview video posted

After my code::dive talk in November, the organizers also recorded an extra 9-minute interview that covered these questions:

What role do you think AI will play in shaping programming languages?
Do you have any rituals or routines before going on stage?
What do you find most exciting about C++?
What advice would you give to the code::dive community?

Here it is…

Published on January 16, 2025 21:08

New U.S. executive order on cybersecurity

The Biden administration just issued another executive order (EO) on hardening U.S. cybersecurity. This is all great stuff. (*) (**)

A lot of this EO is repeating the same things I urged in my essay nearly a year ago, “C++ safety — in context”… here’s a cut-and-paste of my “Call(s) to action” conclusion section I published back then, and I think you’ll see a heavy overlap with this week’s new EO…

Call(s) to action
As an industry generally, we must make a major improvement in programming language memory safety — and we will.

In C++ specifically, we should first target the four key safety categories that are our perennial empirical attack points (type, bounds, initialization, and lifetime safety), and drive vulnerabilities in these four areas down to the noise for new/updated C++ code — and we can.

But we must also recognize that programming language safety is not a silver bullet to achieve cybersecurity and software safety. It’s one battle (not even the biggest) in a long war: Whenever we harden one part of our systems and make that more expensive to attack, attackers always switch to the next slowest animal in the herd. Many of 2023’s worst data breaches did not involve malware, but were caused by inadequately stored credentials (e.g., Kubernetes Secrets on public GitHub repos), misconfigured servers (e.g., DarkBeam, Kid Security), lack of testing, supply chain vulnerabilities, social engineering, and other problems that are independent of programming languages. Apple’s white paper about 2023’s rise in cybercrime emphasizes improving the handling, not of program code, but of the data: “it’s imperative that organizations consider limiting the amount of personal data they store in readable format while making a greater effort to protect the sensitive consumer data that they do store [including by using] end-to-end [E2E] encryption.”

No matter what programming language we use, security hygiene is essential:

Do use your language’s static analyzers and sanitizers. Never pretend using static analyzers and sanitizers is unnecessary “because I’m using a safe language.” If you’re using C++, Go, or Rust, then use those languages’ supported analyzers and sanitizers. If you’re a manager, don’t allow your product to be shipped without using these tools. (Again: This doesn’t mean running all sanitizers all the time; some sanitizers conflict and so can’t be used at the same time, some are expensive and so should be used periodically, and some should be run only in testing and never in production including because their presence can create new security vulnerabilities.)
Do keep all your tools updated. Regular patching is not just for iOS and Windows, but also for your compilers, libraries, and IDEs.
Do secure your software supply chain. Do use package management for library dependencies. Do track a software bill of materials for your projects.
Don’t store secrets in code. (Or, for goodness’ sake, on GitHub!)
Do configure your servers correctly, especially public Internet-facing ones. (Turn authentication on! Change the default password!)
Do keep non-public data encrypted, both when at rest (on disk) and when in motion (ideally E2E… and oppose proposed legislation that tries to neuter E2E encryption with ‘backdoors only good guys will use’ because there’s no such thing).
Do keep investing long-term in keeping your threat modeling current, so that you can stay adaptive as your adversaries keep trying different attack methods.

We need to improve software security and software safety across the industry, especially by improving programming language safety in C and C++, and in C++ a 98% improvement in the four most common problem areas is achievable in the medium term. But if we focus on programming language safety alone, we may find ourselves fighting yesterday’s war and missing larger past and future security dangers that affect software written in any language.

Sadly, there are too many bad actors. For the foreseeable future, our software and data will continue to be under attack, written in any language and stored anywhere. But we can defend our programs and systems, and we will.

(*) My main disappointment is that some of the provisions have deadlines that are too far away. Specifically: Why would it take until 2030 to migrate to TLS 1.3? It’s not just more secure, it’s also faster and has been published for seven years already… maybe I’m just not aware enough of TLS 1.3 adoptability issues though, as I’m not a TLS expert.

(**) Here in the United States, we’ll have to see whether the incoming administration will continue this EO, or amend/replace/countermand it. In the United States, that’s a drawback of using an EO compared to passing an actual law with Congressional approval… an EO is “quick” because the President can issue it without getting legislative approval (for things that are in the Presidential remit), but for the same reason an EO also isn’t “durable” or guaranteed to outlive its administration. Because the next President can just order something different, an EO’s default shelf life is just 1-4 years.

So far, all the major U.S. cybersecurity EOs that could affect C++ have been issued since 2021, which means so far they have all come from one President… and so we’re all going to learn a lot this year, one way or another, about their permanence. (In both the U.S. and the E.U., actual laws are also in progress to shift software liability from consumer to software producers, and those will have real teeth. But here we’re talking about the U.S. EOs from 2021 to date.)

That said, what I see in these EOs is common sense pragmatism that’s forcing the software industry to eat our vegetables, so I’m cautiously optimistic that we’ll continue to maintain something like these EOs and build on them further as we continue to work hard to secure the infrastructure that our comfortable free lifestyle (and, possibly someday, our lives) depends on. This isn’t about whether we love a given programming language, it’s about how we can achieve the greatest hardening at the greatest possible scale for our civilization’s infrastructure, and for those of us whose remit includes the C++ language that means doing everything we can to harden as much of the existing C and C++ code out there as possible — all the programmers in the world can only write so much new/rewritten code every year, and for us in C++ by far the maximum contribution we can make to overall security issues related to programming languages (i.e., the subset of security issues that fall into our remit) is to find ways to improve existing C and C++ code with no manual source code changes — that won’t always be possible, but where it’s possible it will maximize our effectiveness in improving security at enormous scale. See also this 2-minute answer I gave in post-talk Q&A in Poland two months ago.

Published on January 16, 2025 20:51

January 7, 2025

Speaking at University of Waterloo on January 15

Next week, on January 15, I’ll be speaking at the University of Waterloo, my alma mater. There’ll be a tech talk on key developments in C++ and why I think the language’s future over the next decade will be exciting, with lots of time allocated to a “fireside chat / interview” session for Q&A. The session is hosted by Waterloo’s Women in Computer Science (WiCS) group, and dinner and swag by Citadel Securities, where I work.

This talk is open to Waterloo students only (registration required). The organizers are arranging an option to watch remotely for the half of you who are away from campus on your co-op work terms right now — I vividly remember those! Co-op is a great experience.

I look forward to meeting many current students next week, and comparing notes about co-op work terms, pink ties (I still have mine) and MathSoc and C&D food (if Math is your faculty), WATSFIC, and “Water∞loo” jokes (I realize doing this in January is tempting the weather/travel gods, but I do know how to drive in snow…).

Published on January 07, 2025 10:55

January 2, 2025

Speaking at New York C++ meetup on January 13

Less than two weeks from now, on January 13 I’ll be speaking at the New York C++ meetup in Midtown East (Clinton Hall at 230 E 51st Street). I’ll be giving a condensed update of my recent “Peering forward: C++’s next decade” talk, so that there’ll be plenty of time for Q&A — please have your questions ready about all the cool things happening right now in the ISO C++ world.

The meetup is sponsored by Citadel Securities, where I work. Food and drink will be served! Registration is free, but preregistration RSVP is required so please use the form at the link above to reserve your spot. A waitlist is available if space runs out.

I hope to see many of you there!

Published on January 02, 2025 15:17

My little New Year’s Week project (and maybe one for you?)

Here is my little New Year’s Week project: Trying to write a small library to enable compiler support for automatic raw union member access checking.

The problem, and what’s needed

During 2024, I started thinking: What would it take to make union accesses type-checked? Obviously, the ideal is to change naked union types to something safe.(*) But because it will take time and effort for the world to adopt any solution that requires making source code changes, I wondered how much of the safety we might be able to get, at what overhead cost, just by recompiling existing code in a way that instruments ordinary union objects?

Note: I describe this in my C++26 Profiles proposal, P3081R0 section 3.7. The following experiment is trying to validate/invalidate the hypothesis that this can be done efficiently enough to warrant including in an ISO C++ opt-in type safety profile. Also, I’m sure this has been tried before; if you know of a recent (last 10 years?) similar attempt that measured its results, please share it in the comments.

What do we need? Obviously, an extra discriminator field to track the currently active member of each union object. But we can’t just add a discriminator field intrusively inside each union object, because that would change the size and layout of the object and completely break link/ABI compatibility. So we have to store it… extrinsically? … as-if in a global map…?

But that sounds stupid scary: global thread safety lions, data locality tigers, and even some branches bears, O my! Could such extrinsic storage and additional checking possibly be efficient enough?
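To see what "as-if in a global map" means in the most naive form, here is a sketch that keeps the discriminator outside the union object, keyed by the object's address. This is deliberately the scary version: one mutex and one std::unordered_map, not the lock-free structure discussed in this post. The class name naive_union_registry and the function demo_violations are hypothetical; the hook names mirror the union_registry<> documentation quoted later in this post, but the implementation does not. Treating objects the registry has never seen as "not flagged" is also just an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical naive registry: maps a union object's address to the index
// of its currently active alternative, without changing the union's layout.
class naive_union_registry {
    std::mutex mutex_;
    std::unordered_map<const void*, std::size_t> active_;
    std::size_t violations_ = 0;

public:
    // Marks "no alternative is active yet" (union created uninitialized).
    static constexpr std::size_t invalid = static_cast<std::size_t>(-1);

    // Records that alternative alt of the union at address u is now active.
    void on_set_alternative(const void* u, std::size_t alt) {
        std::lock_guard<std::mutex> lock(mutex_);
        active_[u] = alt;
    }

    // Checks an access to alternative alt. Returns false and counts a
    // violation if a different alternative is recorded as active.
    // Objects the registry has never seen are not flagged.
    bool on_get_alternative(const void* u, std::size_t alt) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = active_.find(u);
        const bool ok = (it == active_.end() || it->second == alt);
        if (!ok) ++violations_;
        return ok;
    }

    // Forgets the object so a later object at the same address starts clean.
    void on_destroy(const void* u) {
        std::lock_guard<std::mutex> lock(mutex_);
        active_.erase(u);
    }

    std::size_t violations() {
        std::lock_guard<std::mutex> lock(mutex_);
        return violations_;
    }
};

union Test { int a; double b; };

// Plays the role of compiler-injected calls; returns the violations detected.
inline std::size_t demo_violations() {
    naive_union_registry reg;
    Test t = {42};   reg.on_set_alternative(&t, 0);
    int x = t.a;     reg.on_get_alternative(&t, 0);   // matches: not flagged
    (void)x;
    t.b = 3.14159;   reg.on_set_alternative(&t, 1);
    // A read of t.a here would be type-unsafe; only the check is performed.
    reg.on_get_alternative(&t, 0);                    // mismatch: flagged
    reg.on_destroy(&t);
    return reg.violations();
}
```

Every access takes a global lock and a hash lookup, which is exactly the thread-safety and data-locality cost the question above is worried about.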

My little experiment

I didn’t know, so earlier this year I wrote some code to find out, and this week I cleaned it up and it’s now posted here:

https://github.com/hsutter/cppfront/tree/main/experimental

The workhorse is extrinsic_storage, a fast and scalable lock-free data structure to nonintrusively store additional Data for each pointer key. It’s wait-free for nearly all operations (not just lock-free!), and I’ve never written memory_order_relaxed this often in my life. It’s designed to be cache- and prefetcher-friendly, such as using SOA to store keys separately so that default hash buckets contain 4 contiguous cache lines of keys.

If you’re looking for a little New Year’s experiment…

If you’re looking for a little project over the next few days to start off the year, may I suggest one of these:

Little Project Suggestion #1: Find a bug or improvement in my little lock-free data structure! I’d be happy to learn how to make it better, fire away! Extra points for showing how to fix the bug or make it run better, such as in a PR or your cloned repo.

Little Project Suggestion #2: Minimally extend a C++ compiler (Clang and GCC are open source) as described below, so that every construction/access/destruction of a union type injects a call to my little library’s union_registry<>:: functions which will automatically flag type-unsafe accesses. If you try this, please let me know in the comments what happens when you use the modified compiler on some real world source! I’m curious whether you find true positive union violations in the union-violations.log file – of course it will also contain false positives, because real code does sometimes use unions to do type punning on purpose, but you should be able to eliminate batches of those at a time by their similar text in the log file.

To make #2 easier, here’s a simple API I’ve provided as union_registry<>, which wraps the above in a compiler-integration-targeted API. I’ll paste the comment documentation here:

// For an object U of union type, when        inject a call to this (zero-based alternative #s)
// U is created initialized                   on_set_alternative(&U,0) = the first alternative# is active
// U is created uninitialized                 on_set_alternative(&U,invalid)
// U.A = xxx (alt #A is assigned to)          on_set_alternative(&U,A)
// U.A (alt #A is otherwise mentioned)        on_get_alternative(&U,A)
// U is destroyed / goes out of scope         on_destroy(&U)
//
// That's it. Here's an example:
// {
//   union Test { int a; double b; };
//   Test t = {42};                           union_registry<>::on_set_alternative(&t,0);
//   std::cout << t.a;                        union_registry<>::on_get_alternative(&t,0);
//   t.b = 3.14159;                           union_registry<>::on_set_alternative(&t,1);
//   std::cout << t.b;                        union_registry<>::on_get_alternative(&t,1);
// }                                          union_registry<>::on_destroy(&t);
//
// For all unions with under 256 alternatives, use union_registry<>
// For all unions with between 256 and 16k alternatives, use union_registry
// If you find a union with >16k alternatives, email me the story and use union_registry

Rough initial microbenchmark performance

My test environment:

CPU: 2.60 GHz i9-13900H (14 physical cores, 20 logical cores)
OSes: Windows 11, running MSVC natively and GCC and Clang via Fedora in WSL2

My test harness provided here:

14 test runs: Each successively uses { 1, 2, 4, 8, 16, 32, 64, 1, 2, 4, 8, 16, 32, 64 } threads
Each run tests 1 million union objects, 10,000 at a time, 10 operations on each union; the test type is union Union { char alt0; int alt1; long double alt2; };
Each run injects 1 deliberate “type error” failure to trigger detection, which results in a line of text written to union-violations.log that records the bad union access including the source line that committed it (so there’s a little file I/O here too)
Totals:
  14 million union objects created/destroyed
  140 million union object accesses (10 per object, includes construct/set/get/destroy)

On my machine, here is the total run-time overhead (“total checked” time using this checking, minus “total raw” time using only ordinary raw union access), for a typical run of the whole 140M union accesses:

Compiler | total raw (ms) | total checked (ms) | total overhead (ms) | Notes
MSVC 19.40 -O2 | ~190 | ~1020 | ~830 | -Ox checked was the same or very slightly slower; -Os checked was 3x slower
GCC 14 -O3 | ~170 | ~600-800 | ~430-630 | -O2 overall was slightly slower; note higher variability
Clang 18 -O3 | ~170 | ~510 | ~340 | -O2 checked was about 40% slower

Dividing that by 140 million accesses, the per-access overhead is:

Compiler | total overhead (ns) / total accesses | average overhead / access (ns)
MSVC | 830M ns / 140M accesses | 5.9 ns / access
GCC (midpoint) | 530M ns / 140M accesses | 3.8 ns / access
Clang | 340M ns / 140M accesses | 2.4 ns / access

Finally, recall we’re running on a 2.6 GHz processor = 2.6 clock cycles per ns, so in CPU clock cycles the per-access overhead is:

Compiler | average overhead / access (cycles)
MSVC | 15 cycles
GCC | 9.9 cycles
Clang | 6.2 cycles
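Since the invitation is to check the math, here is a small sketch that redoes the unit conversions at compile time. The helper names (ns_per_access, cycles_per_access, near) are made up for this illustration; the inputs are the overhead figures quoted above. It assumes, as the text does, that 2.6 GHz means 2.6 cycles per nanosecond, and it reproduces the cycle figures from the rounded nanosecond values. Carrying full precision instead gives roughly 15.4, 9.8, and 6.3 cycles, which does not change the picture.

```cpp
#include <cassert>

// Spread a total overhead given in milliseconds over a number of accesses,
// yielding nanoseconds per access (1 ms = 1,000,000 ns).
constexpr double ns_per_access(double overhead_ms, double accesses) {
    return overhead_ms * 1000000.0 / accesses;
}

// Convert nanoseconds to clock cycles; a clock rate in GHz is cycles per ns.
constexpr double cycles_per_access(double ns, double ghz) {
    return ns * ghz;
}

// Tolerance comparison usable in constant expressions.
constexpr bool near(double a, double b, double tol) {
    return (a - b < tol) && (b - a < tol);
}

constexpr double total_accesses = 140e6;  // 140 million accesses
constexpr double clock_ghz = 2.6;

// Nanoseconds per access, from the total overhead in ms.
static_assert(near(ns_per_access(830.0, total_accesses), 5.9, 0.05), "MSVC");
static_assert(near(ns_per_access(530.0, total_accesses), 3.8, 0.05), "GCC midpoint");
static_assert(near(ns_per_access(340.0, total_accesses), 2.4, 0.05), "Clang");

// Cycles per access, from the rounded ns figures.
static_assert(near(cycles_per_access(5.9, clock_ghz), 15.0, 0.5), "MSVC");
static_assert(near(cycles_per_access(3.8, clock_ghz), 9.9, 0.05), "GCC midpoint");
static_assert(near(cycles_per_access(2.4, clock_ghz), 6.2, 0.05), "Clang");
```

If this compiles, the decimal points and unit conversions are at least self-consistent, so any error would have to be in the measurements or in the benchmark itself.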

This… seems too good to be true. I may well be making a silly error (or several) but I’ll post anyway so we can all have fun correcting them! Maybe there’s a silly bug in my code, or I moved a decimal point, or I converted units wrong, but I invite everyone to have fun pointing out the flaw(s) in my New Year’s Day code and/or math – please fire away in the comments.

Elaborating on why this seems too good to be true: Recall that one “access” means to check the global hash table to create/find/destroy the union object’s discriminator tag (using std::atomics liberally) and then also set or check either the tag (if setting or using one of the union’s members) and/or the key (if constructing or destroying the union object). But even a single L2 cache access is usually around 10-14 cycles! This would mean this microbenchmark is hitting L1 cache almost always, even while iterating over 10,000 active unions at a time, often with more hot threads than there are physical or logical cores pounding on the same global data structure, and occasionally doing a little file I/O to report violations.

Even if I didn’t make any coding/calculation errors, one explanation is that this microbenchmark has great L1 cache locality because the program isn’t doing any other work, and in a real whole program it won’t get to run hot in L1 that often – that’s a valid possibility and concern, and that’s exactly why I’m suggesting Little Project #2, above, if anyone would like to give that little project a try.

In any event, thank you all for all your interest and support for C++ and its evolution and standardization, and I wish all of you and your families a happier and more peaceful 2025!

(*) Today we have std::variant which safely throws an exception if you access the wrong alternative, but variant isn’t as easy to use as union today, and not as type-safe in some ways. For example, the variant members are anonymous so you have to access them by index or by type; and every variant in the program is also anonymous == the same type, so we can’t distinguish/overload unrelated variants that happen to have similar alternatives. I think the ideal answer – and it looks like ISO C++ is just 1-2 years from being powerful enough to do this! – will be something like the safe union metaclass using reflection that I’ve implemented in cppfront, which is as easy to use as union and as safe as variant – see my CppCon 2023 keynote starting at 39:16 for a 4-minute discussion of union vs. variant vs a safe union metafunction that uses reflection.

Published on January 02, 2025 00:30

December 11, 2024

My code::dive talk video is available: New Q&A

Two weeks ago, Bjarne and I and lots of ISO committee members had a blast at the code::dive C++ conference held on November 25, just two days after the end of the Wrocław ISO C++ meeting. Thanks again to Nokia for hosting the ISO meeting, and for inviting us all to speak at their conference! My talk was an updated-and-shortened version of my CppCon keynote (which I also gave at Meeting C++; I’ll post a link to that video too once it’s posted):

If you already saw the CppCon talk, you can skip to these “new parts at the end” where the Q&A got into very interesting topics:

44:48 Summary: Why 2024 has turned into a pivotal year for C++ (* transcript at bottom of this post)
47:00 Erroneous behavior in C++: What’s the run-time cost and can we opt out when needed?
48:00 My new paper proposing taking safety-related undefined behavior and turning it all off by default: I have a dream
51:00 This new ^^ reflection operator: Will it litter our code?
54:14 What do you think of evolving/extending C++ vs. directions like cppfront vs. C++ alternatives like Carbon?

Finally, I’m glad I got a chance to give this last answer to cap things off, and thanks again for the audience question that led to it:

57:55 Why I think the most impactful way I can contribute toward improving our society is through improving existing C++ code (** transcript at bottom of this post)

That morning, on our route while traveling from the hotel to the conference site, at one point we noticed that up ahead there was a long line of people all down the length of a block and wrapped around the corner. It took me a few beats to realize that was where we were going, and those were the people still waiting to get in to the conference (at that time there were already over 1,000 people inside the building). Here’s one photo that appeared in the local news showing part of the queue:

In all, I’m told 1,800 people attended on-site, and 8,000 attended online. Thank you again to our Nokia hosts for hosting the ISO C++ meeting and inviting us to code::dive, and thank you to all the C++ developers (and, I’m sure, a few C++-curious) who came from Poland and beyond to spend a day together talking about our favorite programming language!

(*) Here’s a transcript of what I said in that closing summary:

… Reflection and safety improvements as what I see are the two big drivers of our next decade of C++.

So I’m excited about C++. I really think that this was a turning point year, because we’ve been talking about safety for a decade, the Core Guidelines are a decade old, we’ve been talking about reflection for 20 years in the C++ committee — but this is the year that it’s starting to get real. This is the year we put erroneous behavior [in] and eliminated uninitialized locals in the standard, this is the year that we design-approved reflection for the standard — both for C++26 and hopefully they’ll both get in. We are starting to finally see these proposals land, and this is going to create a beautiful new decade, open up a new fresh era of C++. Bjarne [….] when C++11 came out, he said, you know, there’s been so many usability improvements here that C++11, even though it’s fully compatible with C++98, it feels like a new language. I think we’re about to do that again, and to make C++26 feel like a new language. And then just as we built on C++11 and finished it with C++14, 17, 20, the same thing with this generation. That’s how I view it. I’m very hopeful for a bright future for C++. Our language and our community continues to grow, and it’s great to see us addressing the problems we most need to address, so we have an answer for safety, we have an answer for simpler build systems and reducing the number of side languages to make C++ work in practice. And I’m looking forward to the ride for the next decade and more.

And at the end of the Q&A, the final part of my answer about why I’m focused on C++ rather than other efforts:

Why am I spending all this time in ISO C++? Not just because I’m some C++-lover on a fanatical level — you may accuse me of that too — but it’s just because I want to have an impact. I’m a user of this world’s society and civilization. I use this world’s banking system. I rely on this world’s hospital system. I rely on this world’s power grid. And darnit I don’t want that compromised, I want to harden it against attack. And if I put all my energy into some new programming language, I will have some impact, but it’s going to be much smaller because I can only write so much new code. If I can find a way to just recompile — that’s why you keep hearing me say that — to just recompile the billions of lines of C++ code that exist today, and make them even 10% safer, and I hope to make them much more than that safer, I will have had an outsized effect on securing our civilization. And I don’t mean to speak too grandiosely, but look at all the C++ code that needs fixing. If you can find a way to do that, it will have an outsized impact and benefit to society. And that’s why I think it’s important, because C++ is important — and not leaving all that code behind, helping that code too as well as new code, I think is super important, and that’s kind of my motivation.

Published on December 11, 2024 08:26

November 24, 2024

Trip report: November 2024 ISO C++ standards meeting (Wrocław, Poland)

On Saturday, the ISO C++ committee completed the third-last design meeting of C++26, held in Wrocław, Poland. There are just two meetings left before the C++26 feature freeze in June 2025, and C++26 is on track to be completed two more meetings after that in early 2026. Implementations are closely tracking draft C++26; GCC and Clang already support about two-thirds of C++26 features right now.

Our host, Nokia, arranged for high-quality facilities for our six-day meeting from Monday through Saturday. We had over 220 attendees, about two-thirds in-person and the others remote via Zoom, formally representing 31 nations. At each meeting we regularly have new attendees who have never attended before, and this time there were 37 new first-time attendees, mostly in-person. To all of them, once again welcome!

The committee currently has 23 active subgroups, 15 of which met in parallel tracks throughout the week. Some groups ran all week, and others ran for a few days or a part of a day, depending on their workloads. We also had two combined informational evening sessions to inform the committee broadly about progress on key topics: one on contracts, the other on relocatability. You can find a brief summary of ISO procedures here.

This time, the committee adopted the next set of features for C++26, and made significant progress on other features that are now expected to be complete in time for C++26.

In addition to features already approved for C++26 at previous meetings, at this meeting three major features made strong progress:

P2996 Reflection is in specification wording review aiming for C++26.
P2900 Contracts continues to have a chance of being in C++26 – probably more time was spent on contracts this week in various subgroups than on any other feature.
P3081 Safety Profiles and P3471R0 Standard library hardening made progress into the Evolution group and have a chance of being an initial set of profiles for C++26.

Adopted for C++26: Core language changes/features

In addition to fixing a list of defect reports, the core language adopted 8 papers, including the following…

P2686R5 “constexpr structured bindings and references to constexpr variables” by Corentin Jabot and Brian Bi. The paper itself contains a great explanation, pasting from page 8:

You can now declare structured bindings constexpr. Because structured bindings behave like references, constexpr structured bindings are subject to similar restrictions as constexpr references, and supporting this feature required relaxing the previous rule that a constexpr reference must bind to a variable with static storage duration. Now, constexpr references and structured bindings may also bind to a variable with automatic storage duration, but only when that variable has an address that is constant relative to the stack frame in which the reference or structured binding lives.

Thanks, Corentin and Brian!

P3068R6 “Allowing exception throwing in constant evaluation” by Hana Dusíková continues the very-welcome and quite-inexorable march toward allowing more and more of C++ to run at compile time. “Compile-time C++, now with exceptions so you can use normal C++ without rewriting your error handling to return codes/std::expecteds!” is the nutshell synopsis. Thank you, Hana!
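For a rough sense of what changes, here is a sketch with hypothetical names (bad_digit, digit_value) that are not from the paper. The uncommented part already compiles as C++14: a constexpr function may contain a throw as long as that path is not reached during constant evaluation. The commented-out part shows what I understand the paper to enable in C++26, namely throwing and catching during constant evaluation; treat that part as an assumption, since it was not compiled for this example.

```cpp
#include <cassert>

// Hypothetical exception type and helper, for illustration only.
struct bad_digit {};

// Convert a decimal digit character to its numeric value.
constexpr int digit_value(char c) {
    if (c < '0' || c > '9')
        throw bad_digit{};   // error path uses an ordinary exception
    return c - '0';
}

// Fine today: the throwing branch is never reached during constant evaluation.
static_assert(digit_value('7') == 7, "evaluated at compile time");

// At run time the same function reports errors by throwing.
inline bool rejects_at_run_time(char c) {
    try {
        (void)digit_value(c);
        return false;
    } catch (const bad_digit&) {
        return true;
    }
}

// Assumed C++26 behavior (not compiled here): the throw itself can happen
// during constant evaluation and be caught there.
//
//   constexpr int digit_or_zero(char c) {
//       try { return digit_value(c); } catch (const bad_digit&) { return 0; }
//   }
//   static_assert(digit_or_zero('x') == 0);
```

The appeal is that digit_value keeps its normal error handling in both worlds, instead of being rewritten to return an error code just so it can run at compile time.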

Adopted for C++26: Standard library changes/features

In addition to fixing a list of defect reports, the standard library adopted 19 papers, including the following…

P3370R1 “Add new library headers from C23” by Jens Maurer is an example of how C++ continues trying to align with C. As the paper states: “C23 added the and headers. This paper [adds] these headers to C++ to increase the subset of code that compiles with C and C++. […] Type-generic macros and type-generic functions do not exist in C++, but function templates can provide the same call interface. Thus, the use of the former in C is replaced by the latter in C++.” Thank you Jens!

(For those who don’t know of Jens, he’s one of the committee’s unsung Energizer Bunnies: He chairs the Core language specification working group all week long at every meeting, he’s the key logistics person who makes every meeting run smoothly from organizing room layouts and A/V to break drinks/snacks, he organizes every meeting’s space allocations for all the subgroups and evening sessions, and after all that he clearly still has time left over to write standard library papers too! We don’t know how he does it all, but the unconfirmed rumor is that there’s a secret clone involved; investigation is still pending.)

Because Hana (already mentioned above) can’t stop adding to constexpr, P3309R3 “constexpr atomic and atomic_ref” by Hana Dusíková does what it says on the tin. If you’re wondering whether this is important (after all, we don’t support threads in constexpr code… yet… and usually atomic is about concurrency), the paper explains:

This paper […] allows implementing other types (std::shared_ptr, persistent data structures with atomic pointers) and algorithms (thread safe data-processing, like scanning data with atomic counter) with just sprinkling constexpr to their specification.

So perhaps a good synopsis would be: “removing the last excuse not to make shared_ptr available in constexpr code!” Thanks, Hana!

P1928R15 “std::simd — merge data-parallel types from the Parallelism TS 2” by Matthias Kretz adopts the data-parallel basic_simd types from the TS into C++26 as std::basic_simd. Two notes: (1) The “changes since the TS” section has a “constexpr everything” section, because that’s just how we roll in WG21 these days (cf: Hana’s papers above). (2) Note the “R” revision number, which indicates the 15 revisions of this proposal that were needed to get it to land — thank you for the determined hard work, Matthias, and once again congratulations from us all: When this proposal was adopted, a sustained round of loud applause filled the room!

Last but not least, P3325R5 “A utility for creating execution environments” by Eric Niebler builds on the huge new std::execution concurrency and parallelism library that was adopted at our previous meeting this summer (see my summer 2024 trip report) to additionally make it easier to create and merge loci of execution. Definitely read section 4 of the paper for a full description of the motivation and use cases. Thanks very much, Eric!

Other progress

All subgroups continued progress, more of which will no doubt be covered in other trip reports. Here are a few more highlights…

SG1 (Concurrency): Concurrent queues “might” make C++26, and this is one of the most compelling demonstrations of the “sender/receiver” design pattern in the new std::execution that was adopted at our previous meeting. Concurrent queues would also be (finally) the first concurrent data structure in the standard. Also, although SG1 has been trying to fix/specify memory_order_consume since C++11, this has not succeeded, so the feature has now been removed.

SG7 (Compile-Time Programming): Made progress on several papers. Perhaps the most visible decision was to decide on a reflection syntax, ^^, which apparently some people are lobbying hard to call the “unibrow operator.”

SG15 (Tooling): Progressed work on the Ecosystem IS (International Standard), which achieved the milestone of being approved to be forwarded to the main design subgroups.

SG20 (Education): Progressed work on creating C++ teaching guidelines about topics that should be taught in C++ educational settings. Encouraged paper authors and chairs to send papers to SG20 for teachability feedback.

SG21 (Contracts): Spent Wednesday in the Core working group (instead of meeting separately) because EWG sent contracts to Core aiming for C++26, so the contracts experts went to Core to help with wording review. Continued work on trying to increase consensus on some design details.

SG23 (Safety and Security): Approved several papers to progress, including to reduce undefined behavior in the language and specifically time travel optimizations, and to send an initial set of core safety profiles to EWG aiming for C++26. The proposal P3390 “Safe C++” by Sean Baxter was seen and received support as a direction that can be pursued in addition to and complementary with Profiles, where Profiles are useful to define subsets of current C++ and its features and reducing its undefined behaviors to reduce unsafety in current C++, and proposals like Safe C++ can be useful to propose new extensions for an expanded C++ to try to achieve larger/stronger safety guarantees not possible in just a subset/constrained C++. The SG23 vote on which to prioritize was 19:11:6:9 for Profiles:Both:Neutral:SafeC++.

EWG (Language Evolution Working Group) forwarded contracts aiming for C++26, understanding however that there are still a few unresolved contentious design points; the groups are working on increasing consensus between now and the next meeting. Adopted the ^^ operator for reflection. Pattern matching is still trying to make C++26. Three safety papers progressed: P3081 core safety profiles (see below; for detailed telecon review between meetings and aiming for approval at our next meeting), P3471R0 “Standard library hardening” by Konstantin Varlamov and Louis Dionne which passed unanimously (no one even neutral!), and P2719 “Type-aware allocation and deallocation functions” by Louis Dionne and Oliver Hunt which offers safety mitigation building blocks. The group also approved trivial relocatability for C++26.

LEWG (Library Evolution Working Group) reviewed 30 papers this week. LEWG is specifically prioritizing its time on these topics as papers are available: relocatability, parallel algorithms, concurrent queues, constexpr containers, safety profiles, and pattern matching.

Thank you to all the experts who worked all week in all the subgroups to achieve so much this week!

What’s next

Our next meeting will be in Hagenberg, Austria hosted by University of Applied Sciences Upper Austria, RISC Software GmbH, and Softwarepark Hagenberg.

Thank you again to the over 220 experts who attended on-site and on-line at this week’s meeting, and the many more who participate in standardization through their national bodies!

C++ is a living language and moving fast. Thank you again to everyone reading this for your interest and support for C++ and its standardization.

Coda: My papers at this meeting

My papers aren’t what’s most important, but since this is my trip report I should mention my papers.

I had eight papers at this meeting, of which seven were seen at the meeting and one will be seen in an upcoming telecon. Here they are in the rough order they were seen…

P3437R0 “Principles: Reflection and generation of source code”

On Monday, in the SG7 (Compile-Time Programming) subgroup I presented P3437R0 “Principles: Reflection and generation of source code.”

In a nutshell, P3437 advocates for the principle that reflection and generation are primarily for automating reading and writing source code, the way a human programmer would do but so the human doesn’t have to do it by hand. I argue that this view implies that, at least by default:

reflecting a class (or other entity) in consteval compile-time code should be able to see everything a human could know from reading the source code of the class’s definition; and, similarly, generating a class in consteval code should be able to write anything the human programmer could write (including things like adding specializations to std) and have the same language meaning as if the programmer wrote it by hand (including normal name lookup and access control).

Non-default modes might do more or different things, but I argued that “as if the human read/wrote the source code” should be the default semantics and get the “nice” syntax that C++ programmers would naturally expect unless they were doing something special.

The group agreed and approved the paper, 15:8:3 in favor:neutral:against.

P3439 “Chained comparisons: Safe, correct, efficient”

On Tuesday, I presented P3439 “Chained comparisons: Safe, correct, efficient” in the language evolution working group (EWG). This paper proposes that comparison chains like a <= b < c actually work. (I’ve already implemented this in cppfront.) Today, people try to write such code, and the standard says it’s required to compile but do the wrong thing, which is not great, and all major compilers will warn about it but then compile it anyway because they must. Adopting this proposal would fix real bugs, including security bugs, just by recompiling existing code. This change would also make the language simpler, because with this feature we could write a <= b && b < c as just a <= b < c. And that would also make the language slightly faster, by naturally avoiding multiple evaluation of middle terms like b.

This is the only part of my original paper P0515R0 “Consistent comparisons” (aka the operator spaceship <=> paper) that has not yet been adopted into the standard. It was rejected before the pandemic when Barry Revzin and I tried a second time to get it adopted, but I felt it was appropriate to bring it back now because there’s new urgency and new information. A few highlights from the paper:

Today, comparison chains like min <= index_expression < max are valid code that do the wrong thing; for example, 0 <= 100 < 10 means true < 10 which means true, certainly a bug. Yet that is exactly a natural bounds check; for indexes, such subscript chains’ current meaning is always a potentially exploitable out-of-bounds violation.

[P0893R1] reported that code searches performed by Barry Revzin with Nicolas Lesser and Titus Winters found:
Lots of instances of such bugs in the wild: in real-world code “of the assert(0 <= ratio <= 1.0); variety,” and “in questions on StackOverflow [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11].”
“A few thousand instances over just a few code bases” (emphasis original) where programmers today write more-brittle and less-efficient long forms such as min <= index_expression && index_expression < max because they must, but with multiple evaluation of index_expression and with bugs because of having to write the boilerplate and sometimes getting it wrong.
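The 0 <= 100 < 10 example can be checked directly in today's C++. In this small sketch the function names are invented for illustration: chain_as_parsed_today spells out with explicit parentheses how the chain is grouped now, and in_range is the long form that programmers have to write by hand.

```cpp
#include <cassert>

// How a chain lo <= i < hi is grouped today: (lo <= i) yields a bool, and
// that bool (0 or 1) is then compared against hi.
constexpr bool chain_as_parsed_today(int lo, int i, int hi) {
    return (lo <= i) < hi;
}

// What the programmer meant: the long form that must be written by hand
// today, mentioning i twice.
constexpr bool in_range(int lo, int i, int hi) {
    return lo <= i && i < hi;
}

// 0 <= 100 is true, and true < 10 is 1 < 10, which is true: the "bounds
// check" passes for an index that is out of bounds.
static_assert(chain_as_parsed_today(0, 100, 10), "chain accepts 100");

// The intended check correctly rejects it.
static_assert(!in_range(0, 100, 10), "100 is not in [0, 10)");
static_assert(in_range(0, 5, 10), "5 is in [0, 10)");
```

The two functions disagree exactly in the case that matters for bounds safety, which is the gap the proposal aims to close.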

See the paper for more new information that warrants considering this anew.

The group voted 30:2:2 in favor:neutral:against for me to bring this back with standardese wording to the February meeting, and if things go well there it might make C++26.

Next: Three safety/security proposals for me at this meeting

Although the chained-comparisons proposal had a safety and security aspect, I don’t consider it principally a “safety proposal.” However, I did present three actual safety and security proposals at this meeting, and here they are… (Note: There were other such proposals by others too, not just by me! And I’m excited about them too. These are just my three…)

Of the four main language safety categories we have to improve (type, bounds, initialization, and lifetime), the first three are easier and are the focus of P3436 and P3081. Lifetime is the hardest to improve in C++, and requires more effort, in my proposal’s case writing a static analysis rule engine; but I think major improvement is possible without heavy annotation of existing code, and this is the subject of P3465 (promoting P1179).

Safety #1 of 3: P3436R0 “Removing safety-related undefined behavior by default”

On Wednesday morning in SG23 (Safety and Security), I presented P3436R0 “Removing safety-related undefined behavior (UB) by default.” This would be a pretty big deal… as I mentioned in the presentation, I firmly believe that most observers of C++, even friendly ones, do not expect us to be able to “do something significant” about UB in C++. UB is after all a fundamental part of C++, right? … Right?

My conjecture is “no it isn’t,” and the point of this paper is easy to summarize:

C++ already bans undefined behavior in a huge subset of the language: constexpr evaluation. A lot of people do not realize that, yes, we already did that… we already snuck a lot of safety into the language while no one was paying particular attention. Those who say UB is endemic throughout C++ have already started to fall behind the times, since we added and then gradually expanded constexpr.
Let’s enumerate each such case of UB already prevented in constexpr evaluation, and prevent it in normal execution too: either always (if it’s cheap and easy, as we just did with uninitialized local variables), or else under a Profile (so any overhead has an easy way to opt-in/opt-out when you do/don’t want the safety).
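A minimal sketch of the constexpr point, using two hypothetical helpers (add and element) chosen for illustration: the same functions that could hit undefined behavior at run time are rejected by the compiler when the UB would occur during constant evaluation. The offending lines are left commented out so the example still compiles.

```cpp
#include <cassert>
#include <climits>

// Signed integer overflow is undefined behavior at run time.
constexpr int add(int a, int b) { return a + b; }

// Reading past the end of an array is undefined behavior at run time.
constexpr int element(const int (&a)[3], int i) { return a[i]; }

constexpr int values[3] = {10, 20, 30};

// These evaluations involve no UB, so they are valid constant expressions.
static_assert(add(1, 2) == 3, "no overflow");
static_assert(element(values, 2) == 30, "in bounds");

// Each of these is a compile-time error if uncommented, because the
// evaluation would have undefined behavior and so is not a constant expression:
//
//   constexpr int overflowed = add(INT_MAX, 1);       // signed overflow
//   constexpr int past_end   = element(values, 3);    // out-of-bounds read
```

The paper's conjecture, as described above, is that this list of cases the constant evaluator already rejects is a ready-made starting inventory for what to prevent in normal execution.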

I think most people don’t expect that C++ could make a major dent in undefined behavior and still be true to C++. But P3436 makes the bold conjecture that maybe we can not only make a dent, but actually eliminate UB in C++ by default when safety Profiles are enabled, by combining (a) that we already do it in constexpr code, with (b) that we plan to have Profiles as a tool to opt in/out of safety modes that require source changes or execution overheads.

The group agreed and voted unanimously 25:3:0 in favor:neutral:against.

Important reality check here: This is the start, not the end… this doesn’t mean the proposal is done, it means the group said “we like the idea, now go do the hard work of actually listing all the UB cases and how you propose to handle each one and writing up standard specification wording for all that.” Most “encouragement” polls in the committee actually mean “yes please go do more work.” (This takes a lot of getting used to for new participants.)

Safety #2 of 3: P3465R0 “Pursue P1179 as a Lifetime TS”

On Wednesday after lunch in SG23 (Safety and Security), I made the first-ever presentation in WG21 of the C++ Core Guidelines Lifetime safety static analysis profile, of which I’m the primary designer (with lots of expert help; big thanks again everyone listed in P1179!). This analysis catches many common lifetime dangling issues (not just for pointers, but for generalized Pointers including iterators and views), usually with little or no source annotation. This is a full portable “static analysis specification” which means a detailed spec of state and state transitions that an implementation should do in a function body, so that different implementations will give the same answers.

I first publicly presented this work, with live demos of the first prototype, in my CppCon 2015 talk (starting at 30:28). I then published the detailed specification on GitHub and cc’d/FYI’d ISO via the paper P1179 “Lifetime safety: Preventing common dangling” back in 2019, but at that point I was just giving the committee an FYI… I never asked to present it for adoption, in part because I wanted to gain more usage experience, and in part because at the time the committee did not yet have this kind of safety as a first-order priority.

Now, times have changed: Parts of this have now been implemented and shipped in Microsoft Visual Studio, JetBrains CLion, a Clang fork, and even a smidgen in Clang trunk I’m told. And safety/security is finally a first-order concern, so the current paper P3465R0 “Pursue P1179 as a Lifetime TS” points back to P1179 and suggests that the time is ripe to turn it into an ISO C++ Technical Specification. I proposed that we pursue turning this analysis specification into a Technical Specification (TS) separate from the standard in order to get one more round of WG21-endorsed experience before we cast it in stone as a lifetime Profile: While implementations have implemented a lot of the design, they haven’t yet implemented all of it, and I think making it a TS would show WG21 interest and spur them to complete their implementations so that we could validate the last few important parts too on large amounts of real-world code, and make any needed adjustments. If all goes well with that, I hope to propose it for the standard itself in the future.

The group voted 24:0:1 in favor:neutral:against to direct me to turn P1179 into a working paper for a Technical Specification. Fortunately, the specification of state and state transitions is already very concrete, so it’s already at approximately the same level of detail as normal C++ language specification wording.

Safety #3 of 3: P3081R0 “Core safety Profiles: Specification, adoptability, and impact”

My third safety/security paper was P3081R0 “Core safety Profiles: Specification, adoptability, and impact.” I presented it Wednesday afternoon in SG15 (Tooling), Thursday morning in SG23 (Safety and Security), and Friday morning in EWG (the main language evolution group), with a target of C++26.

This is a companion paper to my “C++ safety, in context” blog essay this spring; see that essay for the full motivation, context, and rationale. P3081 contains the concrete proposed semantics:

Proposes a concrete initial set of urgently needed enforced safety Profiles.

Describes how a Profiles implementation can prioritize adoptability and safety improvement impact, especially by silently fixing serious bugs in existing code just by recompiling it (i.e., no manual code changes required) where possible, and by making it easy to opt into broadly applicable safety profiles.

I tried to push the boundaries of what’s possible in C++ by suggesting we do something we’ve never done before in the ISO C++ standard: Have the Standard actually normatively (i.e., as a hard requirement) require implementations (usually compilers, and not third-party post-build tools) to offer safety-related “fixits” to automatically correct C++ code where the fix can be super reliable. We’ve never formally required anything like this before, but I felt it was important to try to push this boundary to raise the bar for all compilers because it’s so important for safety adoptability… yet I really wasn’t sure how this suggestion would fly (or get shot out of the sky, as boundary-pushing proposals tend to be).

The first presentation was on Wednesday after lunch to SG15 (Tooling) because the paper suggested this novelty of requiring C++ compilers to offer automatic fixits. The votes on two related polls were both unanimously 8:2:0 in favor:neutral:against. People in the room included tooling and compiler experts for GCC and Clang, and their main comment was (slightly paraphrased) ‘yeah, sure, this is 2024, our C++ compilers all already offer fixits, let’s require compilers to do it consistently.’ Whew.

The second presentation was on Thursday after breakfast to SG23 (Safety and Security), which has been encouraging work on Profiles but has not yet had any concrete proposal to forward to EWG, the main evolution working group. SG23 gave feedback, and then voted unanimously 23:1:0 in favor:neutral:against to forward P3081 to EWG specifically targeting C++26, including that Bjarne’s paper and this one should be merged to reflect the syntax decisions made earlier in the day based on Bjarne’s paper. This is the first Profiles proposal to be forwarded from SG23.

The third presentation was on Friday after breakfast to EWG, which voted 44:4:3 in favor:neutral:against to pursue this paper for C++26, and schedule teleconferences between now and February to go line-by-line through the paper in detail.

Disclaimer: Note this means we have to do a ton of work in the next few months, if it is to have a hope of actually making C++26.

P2392R3 “Pattern matching using is and as”

On Thursday in EWG, I presented P2392R3 “Pattern matching using is and as.” There are two main parts to this paper: is/as expressions to unify and simplify safe queries/casts, and inspect pattern matching that uses the is/as approach.
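For context, here are some of the different spellings that today's C++17 uses for "is this value an X, and if so, give it to me as an X", which is the kind of query/cast the paper aims to unify. This is only a sketch of existing language and standard-library facilities, not the proposed syntax; the helper names and the `Shape`/`Circle`/`Square` types are made up for illustration.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <variant>

using IntOrString = std::variant<int, std::string>;

struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 2.0; };
struct Square : Shape { double side = 1.0; };

// std::variant: query with holds_alternative, then extract with get.
inline int variant_int_or_zero(const IntOrString& v) {
    if (std::holds_alternative<int>(v)) return std::get<int>(v);
    return 0;
}

// std::any: the pointer form of any_cast does the query and the cast in one step.
inline int any_int_or_zero(const std::any& a) {
    if (const int* p = std::any_cast<int>(&a)) return *p;
    return 0;
}

// Polymorphic class hierarchy: dynamic_cast on a pointer.
inline double radius_or_zero(const Shape& s) {
    if (const auto* c = dynamic_cast<const Circle*>(&s)) return c->radius;
    return 0.0;
}

// Hypothetical usage check.
inline void query_cast_demo() {
    IntOrString v = 42;
    assert(variant_int_or_zero(v) == 42);
    v = std::string("text");
    assert(variant_int_or_zero(v) == 0);

    assert(any_int_or_zero(std::any(7)) == 7);

    Circle c;
    Square s;
    assert(radius_or_zero(c) == 2.0);
    assert(radius_or_zero(s) == 0.0);
}
```

Each facility has its own vocabulary for the same two steps, which is the inconsistency that motivates a single query/cast notation.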

This time the results were quite mixed, with no consensus to encourage proceeding with my proposal: For the whole paper, EWG voted 19:6:22 in favor:neutral:against. For just the is/as portion, EWG voted 21:8:20 in favor:neutral:against. That is clearly no consensus at all, never mind strong encouragement, and the competing proposal from Michael Park got 33:6:10. But I’m not giving up on P2392… I’ll try to incorporate the feedback heard in the room, and perhaps improve consensus at the next meeting or two.

P3466R0 “(Re)affirm design principles for future C++ evolution”

Finally, on Friday afternoon in EWG, I presented P3466 “(Re)affirm design principles for future C++ evolution.” Note: I presented the initial revision R0, and this link is to a draft revision R1 that incorporates the direction from the group (but my R1 edits are still being reviewed to make sure I applied EWG’s direction correctly).

The summary up front is:

C++ is a living language that continues to evolve. Especially with C++26 and compile-time programming, and new proposals for type and memory safety, we want to make sure C++ evolution remains as cohesive and consistent as possible so that (a) it’s “still C++” in that it hews to C++’s core principles, and (b) it’s delivering the highest quality value to make C++ code safer and simpler.

The Library Evolution WG has adopted written design principles to guide proposals and their discussion. This paper proposes that the Language Evolution WG also adopt written principles to guide new proposals and their discussion, and proposes the principles in this paper as a starting point.

EWG voted to turn this paper into a new Standing Document (which future papers can add to) to document EWG’s values: 29:22:2 in favor:neutral:against.

P0707R5 “Metaclass functions for generative C++”

The committee didn’t have time to consider all papers at the meeting, and my paper P0707R5 “Metaclass functions for generative C++” is one that got deferred to be considered at a between-meetings Zoom telecon over the winter.

That’s it for my papers this time… whew. More news next time, from Austria…

View more on Herb Sutter's website »


Published on November 24, 2024 22:00

November 11, 2024

A new chapter, and thoughts on a pivotal year for C++

Starting today I’m excited to be working on a new team, with my C++ standards and community roles unchanged. I also wanted to write a few words about why I’m excited about continuing to invest my time heavily in C++’s standardization and evolution especially now, because I think 2024 has been a pivotal year for C++ — and so this has turned into a bit of an early “year-end C++ retrospective” post too.

It’s been a blast to be on the Microsoft Visual C++ compiler team for over 22 years! The time has flown by because the people and the challenges have always been world-class. An underappreciated benefit of being on a team that owns a foundational technology (like a major C++ compiler) is that you often don’t have to change teams to find interesting projects, because new interesting projects need compiler support and so tend to come to you. That’s been a real privilege, and why I stuck around way longer than any other job I’ve held. Now I am finally going to switch to a new job, but I’ll continue to cheer my colleagues on as a happy MSVC user on my own projects, consuming all the cool things they’re going to do next!

Today I’m thrilled to start at Citadel Securities, a firm that “combines deep trading acumen with leading-edge analytics and technology to deliver liquidity to some of the world’s most important markets, retail brokerages, and financial institutions.” I’ve known folks at CitSec for many years now (including some who participate in WG 21) and have long known it to be a great organization with some of the brightest minds in engineering and beyond. Now I’m looking forward to helping to drive CitSec’s internal C++ training initiatives, advise on technical strategy, share things I’ve learned along the way about sound design for both usability and pragmatic adoptability, and mentor a new set of talented folks there to not only take their own skilled next steps but also to themselves become mentors to others in turn. I think a continuous growth and learning culture like I’ve seen at CitSec consistently for over a dozen years is one of the most important qualities a company can have, because if you have that you can always grow all the other things you need, including as demands evolve over time. But maybe most of all I’m looking forward to learning a lot myself as I dive back into the world of finance — finance is where I started my junior career in the 80s and 90s, and I’m sure I’ll learn a ton in CitSec’s diverse set of 21st-century businesses that encounter interesting, leading-edge technical challenges every day that go well beyond the ones I encountered back in the 20th.

My other C++ community roles are unchanged — I’m continuing my current term as chair of the ISO C++ committee, I’m continuing as chair of the Standard C++ Foundation, and especially I’m continuing to work heavily on ISO C++ evolution (I have eight papers in the current mailing for this month’s Wrocław meeting!) including supporting those with cppfront prototype implementations. I meant it when I said in my CppCon talk that C++’s next decade will be dominated by reflection and safety improvements, and that C++26 really is shaping up to be the most impactful release since C++11 that started a new era of C++; it’s an exciting time for C++ and I plan to keep spending a lot of time contributing to C++26 and beyond.

Drilling down a little: Why is 2024 a pivotal year for C++? Because for the first time in 2024 the ISO committee has started adopting (or is on track to soon adopt) serious safety and reflection improvements into the draft C++ standard, and that’s a big turning point:

For safety: With uninitialized local variables no longer being undefined behavior (UB) in C++26 as of March 2024, C++ is taking a first serious step toward really removing safety-related UB and achieving the ‘holy grail’ of an easy adoption story: “Just recompile your existing code with a C++26 compiler, with zero manual code changes, and it’s safer with less UB.” This month, I’m following up on that by proposing P3436R1, a strategy for how we could remove all safety-related UB by default from C++, something that I’m pretty sure a lot of folks can’t imagine C++ could ever do while still remaining true to what makes C++ be C++, but that in fact C++ has already been doing for years in constexpr code! The idea I’m proposing is to remove the same cases of UB we already remove in constexpr code also at execution time, in one of two ways for each case: when it’s efficient enough, eliminate that case universally the same as we just did for uninitialized locals; otherwise, leverage the great ideas in the Profiles proposals as a way to opt in/out of that case (see P3436 for details). If the committee likes the idea enough to encourage me to go do more work to flesh it out, over the winter I’ll invest the time to expand the paper into a complete catalog of safety-related UB with a per-case proposal to eliminate that UB at execution time. If we can really achieve a future C++ where you can “just recompile your existing code with safety Profiles enabled, and it’s safer with zero safety-related UB,” that would be a huge step forward. (Of course, some Profiles rules will require code changes to get the full safety benefits; see the details in section 2 of my supporting Profiles paper.)

For reflection: Starting with P2996R7 whose language part was design-approved for C++26 in June 2024, we can lay a foundation to then build on with follow-on papers like P3294R2 and P3437R1 to add generation and more features.
As I demonstrated with examples in the above-linked CppCon talk, reflection (including generation) will be a game-changer that I believe will dominate the next decade of C++ as we build it out in the standard and learn to use it in the global C++ community. I’m working with P2996/P3294 prototypes and my own cppfront compiler to help gather usability experience, and I’m contributing my papers like P0707R5 and P3437R1 as companion/supporting papers to those core proposals to try to help them progress.
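The point that constexpr code already rejects UB can be checked with any C++14-or-later compiler. In this minimal sketch (the function `element_at` and its table are hypothetical, chosen only for illustration), the same out-of-bounds index that would be undefined behavior at execution time makes the program ill-formed when it occurs during constant evaluation.

```cpp
#include <cstddef>

// Hypothetical helper: reads element i of a small fixed table.
// Called at run time with i >= 3, this would be undefined behavior.
constexpr int element_at(std::size_t i) {
    constexpr int table[3] = {10, 20, 30};
    return table[i];
}

// In-range indices are fine in a constant expression.
static_assert(element_at(0) == 10, "first element");
static_assert(element_at(2) == 30, "last element");

// Uncommenting the next line should produce a compile-time error:
// evaluating table[3] would be UB, so the call is not a constant expression.
// static_assert(element_at(3) == 0, "out of range");
```

The function body is identical in both cases; only the evaluation context changes whether the out-of-range access is diagnosed.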

As Bjarne Stroustrup famously said, “C++11 [felt] like a new language,” starting a new “modern” C++ style featuring auto and lambdas and standard safe smart pointers and range-for and move semantics and constexpr compile-time code, that we completed and built on over the next decade with C++14/17/20/23. (And don’t forget that C++11’s move semantics already delivered the ideal adoption story of “just recompile your existing code with a C++11 compiler, with zero manual code changes, and it’s faster.”) From 2011 until now, “modern C++” has pretty much meant “C++ since C++11” because C++11 made that much of a difference in how C++ worked and felt.

Now I think C++26 is setting the stage to do that again: Our next major era of what “modern C++” will mean will be characterized by having safety by default and first-class support for reflection-based generative compile-time libraries. Needless to say, this is a group effort that is accomplished only by an amazing set of C++ pros from dozens of countries, including the authors of the above papers but also many hundreds of other experts who help design and review features. To all of those experts: Again, thank you! I’ll keep trying to contribute what I can too, to help ship C++26 with its “version 1” of a set of these major new foundational tools and to continue to add to that foundation further in the coming years as we all learn to use the new features to make our code safer and simpler.

C++ is critically important to our society, and is right now actively flourishing. C++ is essential not only at Citadel Securities itself, but throughout capital markets and the financial industry… and even that is itself just one of the critical sectors of our civilization that heavily depend on C++ code and will for the foreseeable future. I’m thrilled that CitSec’s leadership shares my view of that, and my same goals for continuing to evolve ISO C++ to make it better, especially when it comes to increasing safety and usability to harden our society’s key infrastructure (including our markets) and to make C++ even easier to use and more expressive. I’m excited to see what the coming decade of C++ brings… 2024 really has shaped up to be a pivotal year for C++ evolution, and I can’t wait to see where the ride takes us next.

View more on Herb Sutter's website »


Published on November 11, 2024 04:15