EASA's W-Shape Process Is a Real Paradigm Shift. It Does Not Solve the Incompleteness Problem.
EASA's Proposed Issue 03 of its AI Concept Paper introduces the W-shape development process as the most credible path for certifying AI-based aerospace systems in Europe, but it does not resolve the fundamental incompleteness problem inherent to machine learning. Founders have until 12 August 2026 to comment on guidance that feeds directly into binding rulemaking under RMT.0742.
Photo: Google DeepMind / Pexels
There is a moment in every AI certification project when the team first opens EASA's guidance and realises that the cherished V-model, their anchor for a decade of DO-178C work, is not going away, but a new shape is growing in front of it. That shape is the W. The debate it has ignited among experienced aerospace engineers is the most substantive technical argument this industry has had in years.
EASA released Proposed Issue 03 of its Concept Paper on Artificial Intelligence on 3 June 2026 and opened it for a ten-week consultation period [2]. It is the final Concept Paper deliverable foreseen under the EASA AI Roadmap 2.0 [2]. Building on Issue 02, which explored Level 1 and Level 2 AI, the new issue completes the technical scope of the Roadmap, broadens the framework to address additional AI techniques including reinforcement learning and symbolic AI, and opens Level 3 AI applications, corresponding to "advanced automation" [2]. Stakeholders are invited to provide their comments no later than 12 August 2026 [3]. If you are building an AI-enabled aerospace or defence product in Europe, that consultation window is the lever you should be pulling right now.
What the W-Shape Actually Does, and Why It Is Not Merely Cosmetic
Let me be precise about what EASA is proposing, because sloppy summaries have already started circulating.
While the traditional V-model remains the backbone of safety-critical system development in aviation, it falls short of addressing the iterative, data-driven, and non-deterministic nature of developing AI-based systems [1]. The core problem is structural: the functional behaviour of an ML model is specified mainly by a large set of parameters adjusted automatically during training, and it is practically impossible to map and trace those parameters to specific functions. As a result, DO-178C traceability objectives are not achievable for an ML model [1].
The W-shape is EASA's architectural response to that incompatibility. The document is precise about what the first V is actually doing: it does not seek to restore full traceability of requirements to implementation, but rather to establish controlled confidence in the data, scenarios, and knowledge that drive the AI-based system's intended behaviour [1]. That is a deliberate philosophical choice, not an oversight. The agency knows it cannot reconstruct code-level traceability and is not pretending otherwise.
To support this first V, an additional architectural element is introduced between the classical system and item layers: the AI constituent [1]. This layer ensures proper handling of AI-specific considerations, such as Operational Design Domain (ODD) definition, iterative design, and verification, before the AI constituent is implemented and integrated into the broader AI-based system architecture. That containment matters in practice. A startup integrating a perception model into a drone's sensor fusion stack should not need to restructure its entire type-design certification basis to accommodate one AI constituent.
The ML constituent, a concept initially introduced by EASA and incorporated into the draft of ED-324, is defined as a bounded collection of hardware and/or software items, at least one of which contains an ML model. It modifies conventional development assurance practices by occupying an intermediate level between the system and the item [1].
Alongside this, the Operational Domain and Operational Design Domain add a further layer of structure. A correct definition of the ODD, as complete as possible, is a prerequisite for adequate quality in the data sets, scenarios, and knowledge bases involved in the AI assurance process [1]. However, the document acknowledges that an exhaustive exploration of all possible operating conditions can be intractable for high-dimensional use cases [1]. EASA is telling you, in writing, that it knows this cannot be made exhaustive. That is intellectual honesty, and it is the right starting point for a serious conversation about what degree of bounded confidence is enough.
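To see why exhaustive exploration becomes intractable, it helps to do the arithmetic. The sketch below is my own illustration, not something taken from the Concept Paper: it assumes each ODD parameter is discretised into a handful of bins and counts the resulting grid cells. The parameter names, bin counts, and dataset size are all hypothetical.

```python
from __future__ import annotations

from math import prod


def grid_cells(bins_per_parameter: dict[str, int]) -> int:
    """Number of grid cells when each ODD parameter is split into discrete bins."""
    return prod(bins_per_parameter.values())


def max_cell_coverage(n_samples: int, n_cells: int) -> float:
    """Best-case fraction of cells hit, assuming every sample lands in a distinct cell."""
    return min(1.0, n_samples / n_cells)


# Hypothetical ODD for a vision-based function. Names and bin counts are
# illustrative assumptions, not values from any EASA document.
example_bins = {
    "altitude": 10,
    "airspeed": 10,
    "sun_elevation": 10,
    "visibility": 10,
    "precipitation": 5,
    "runway_contrast": 5,
    "camera_pitch": 10,
    "camera_roll": 10,
}

cells = grid_cells(example_bins)
coverage = max_cell_coverage(1_000_000, cells)
print(f"{cells:,} cells; best-case coverage with 1,000,000 samples: {coverage:.1%}")
# prints: 25,000,000 cells; best-case coverage with 1,000,000 samples: 4.0%
```

Eight coarsely binned parameters already outrun a million-sample dataset by a factor of 25, and that is the optimistic case where no two samples share a cell. Real ODDs have more parameters and finer boundaries, which is why the argument has to rest on representativeness rather than enumeration.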
The parallel industry standardisation effort reflects the same architecture. EUROCAE WG-114, working jointly with SAE International G-34, recently held a plenary meeting in Brussels, hosted by EUROCONTROL. At that meeting the group finalised Draft 7 of ED-324 "Process Standard for Development and Certification Approval of Aeronautical Products Implementing AI" [7]. EUROCAE WG-114 has placed the draft on its workspace and invites both members and non-members to review and submit comments, with early replies encouraged [6]. The W-shape is therefore not one regulator's idea: it is converging with the emerging industry standard.
The Strongest Counterargument, Stated Fairly
Teams with deep DO-178C and ARP4754A backgrounds will tell you something like this: "The W-shape replaces code traceability with dataset traceability. But dataset traceability is softer. I can show you a requirements-to-code matrix and verify it with structural coverage. What you show me in the first V of the W is a record of how you managed your training data and bounded your ODD. That is a process control argument, not a correctness argument. You still cannot exhaustively cover a high-dimensional ODD, you still cannot prove the model generalises correctly on truly out-of-distribution inputs, and you still end up at the second V doing verification against a constituent whose inner behaviour is not fully interpretable. Where, exactly, is the paradigm shift?"
This concern has support in the technical literature. The MLEAP project found that defining the Operational Domain was challenging, as estimating completeness and representativeness requires knowledge of the exact extent and distribution of certain phenomena, and that the currently publicly available set of tools and methods for the development of AI-based systems lacks operationalizability [12]. That assessment is not hostile: it is a calibrated statement of the gap between what the process can claim and what safety engineers ultimately need.
The sceptic is right that the W-shape does not solve the fundamental incompleteness problem. No process can. The ODD cannot be exhaustively characterised; the generalisation gap cannot be mathematically zeroed out; the model's behaviour on edge cases will always be bounded by empirical confidence rather than proof.
Why the Shift Is Real, Even If Incomplete
The V-model's genius is that it makes process discipline a proxy for correctness. Do the work in the right order, maintain traceability, achieve structural coverage, and the probability of undetected errors falls to a level that can be defended as acceptable. That worked beautifully for deterministic software. The problem is not that the V-model is wrong. The problem is that applying it to machine learning produces a category error: you can trace your training script to its requirements all day long, but the model it produces has a functional identity that cannot be read off the source code.
It is recognised that certain AI assurance challenges stem from intrinsic limitations in the traceability of data or knowledge bases to higher-level requirements. Similarly to classical development assurance methods, AI assurance objectives focus on proportionate and structured processes to build a level of confidence on the correctness of the AI-based systems' intended function [1].
The W-shape acknowledges this category error and responds with a structurally different kind of confidence: confidence in the data generation process, the ODD definition, the representativeness of test sets, and the in-service monitoring loop. Dataset traceability is not a softer form of confidence. It is a different form of confidence appropriate to a different kind of engineered object.
The MLEAP project adds real weight to this argument. MLEAP is a research project initiated by EASA and funded under the Horizon Europe framework, tailored specifically to investigate the challenging objectives of the W-shaped process at the core of the EASA AI Concept Paper [10]. The project methodology involved identification of promising methods and tools from an extensive state of the art, followed by preliminary testing on toy use cases, and subsequently the validation of those primary results on complex aviation use cases [12]. Three aeronautical AI use cases were used: speech-to-text in air traffic control, drone collision avoidance via ACAS Xu, and vision-based maintenance inspection [11][12]. These are not toy problems. Real engineering teams working through real aviation use cases built a pipeline that a certification authority can actually review. That is a significant data point for any founder who hears "W-shape is overhead": overhead that produces no deliverables does not generate a substantial final report with EASA as contracting authority.
The incompleteness problem is also managed through explicit assurance level boundaries. With the current state of knowledge of AI and ML technology, EASA anticipates a limitation on the validity of applications when AI/ML constituents include IDAL A or B items, and no assurance level reduction should be performed for items within AI/ML constituents [1]. This limitation will be revisited when experience with AI/ML techniques has been gained [1]. The Issue 03 document goes further, noting that EASA considered the conditions under which even the largest off-the-shelf models such as LLMs can be used, relying on a strategy of performance evaluation and safe integration [1]. These are not arbitrary constraints. They are the honest expression of what level of confidence the W-shape process can currently provide. Founders planning products at or approaching IDAL A or B should engage the authority early and explicitly on their certification basis.
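A team can turn that boundary into an early architecture screen. The sketch below is a hypothetical internal check, not an EASA tool or an official reading of the Concept Paper. It models a constituent as a bounded collection of items, requires that at least one contains an ML model, and lists any items assigned IDAL A or B so they are raised with the authority early. All class, field, and item names are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# IDAL values that trigger early engagement in this sketch (assumption based
# on the IDAL A/B limitation described in the text above).
HIGH_CRITICALITY_IDALS = frozenset({"A", "B"})


@dataclass(frozen=True)
class Item:
    name: str
    idal: str  # "A" (most severe) through "E"
    contains_ml_model: bool = False


@dataclass(frozen=True)
class MLConstituent:
    name: str
    items: tuple[Item, ...]

    def __post_init__(self) -> None:
        # A constituent without any ML model is just a conventional subsystem.
        if not any(item.contains_ml_model for item in self.items):
            raise ValueError(f"{self.name}: no item contains an ML model")


def items_needing_early_engagement(constituent: MLConstituent) -> list[str]:
    """Names of items whose IDAL falls inside the anticipated limitation."""
    return [
        item.name
        for item in constituent.items
        if item.idal in HIGH_CRITICALITY_IDALS
    ]


# Hypothetical perception constituent for a drone sensor fusion stack.
perception = MLConstituent(
    name="perception",
    items=(
        Item("object_detector", idal="C", contains_ml_model=True),
        Item("detection_monitor", idal="B"),
        Item("camera_driver", idal="C"),
    ),
)
print(items_needing_early_engagement(perception))
# prints: ['detection_monitor']
```

Running a check like this at architecture review time costs almost nothing and surfaces the conversation with the authority before the design hardens around an assumption that may not hold.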
For Founders: What to Do Now
Submit a consultation comment before 12 August 2026. The document will feed directly into RMT.0742, which has subtasks covering a proposal for an AI trustworthiness aviation regulatory framework, development of generic AI-related acceptable means of compliance and guidance material, and necessary adaptations to domain-specific regulatory material [13]. That is a lever most founders do not realise they have. A two-page substantive comment on a specific section costs a day of engineering time and can shift a definition that will later bind your product.
Adopt the W-shape as your internal development standard now, before it becomes mandatory. The friction cost is low. The documentation structure, the ODD definition process, the AI constituent decomposition: these are sensible engineering practices regardless of their certification status. The return is high. When your first Certification Review Item negotiation opens with the authority, you arrive with artefacts already aligned to the framework they are trying to use. The document aims at guiding aviation applicants and organisations when introducing any AI technology in safety-related applications, while noting that it does not constitute at this stage definitive or detailed guidance [1]. That framing matters: the guidance is still forming, and early movers who submit substantive comments genuinely influence the document that will later bind them.
Start with your ODD definition. This is the single highest-leverage document in the entire first V. A tight, well-evidenced ODD signals to any authority that your team understands the operational boundary of the system. A vague or overambitious ODD is the fastest path to a difficult CRI. The MLEAP project itself found that defining the Operational Domain was challenging, as estimating completeness and representativeness requires knowledge of the exact extent and distribution of certain phenomena [12]. Write your ODD as if you will be challenged on every parameter boundary, because you will be.
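One way to make every parameter boundary challengeable is to write the ODD as data rather than prose. The following is a minimal sketch under my own assumptions: the field names, the example parameters, and the evidence references are hypothetical, and nothing here is prescribed by EASA or ED-324. Each parameter carries its bounds, its unit, and a pointer to the evidence that justifies them, and the same definition is reused to check whether observed conditions fall inside the ODD.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OddParameter:
    name: str
    low: float
    high: float
    unit: str
    evidence: str  # pointer to the rationale for these bounds (hypothetical IDs)

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def odd_violations(odd: list[OddParameter], conditions: dict[str, float]) -> list[str]:
    """Describe every parameter that is unobserved or outside its ODD bounds."""
    problems = []
    for param in odd:
        if param.name not in conditions:
            problems.append(f"{param.name}: not observed")
        elif not param.contains(conditions[param.name]):
            value = conditions[param.name]
            problems.append(
                f"{param.name}: {value} {param.unit} outside [{param.low}, {param.high}]"
            )
    return problems


# Hypothetical ODD for a drone perception function.
drone_odd = [
    OddParameter("altitude_agl", 0.0, 120.0, "m", "flight-test report FT-012"),
    OddParameter("visibility", 1500.0, 10000.0, "m", "sensor characterisation SC-003"),
    OddParameter("wind_speed", 0.0, 12.0, "m/s", "envelope analysis EA-007"),
]

print(odd_violations(drone_odd, {"altitude_agl": 80.0, "visibility": 900.0}))
# prints: ['visibility: 900.0 m outside [1500.0, 10000.0]', 'wind_speed: not observed']
```

Treating an unobserved parameter as a problem is a deliberate choice in this sketch: if the system cannot tell whether it is inside a boundary, it cannot claim to be inside the ODD. The evidence field is where a reviewer's challenge lands, so an empty one is a warning sign.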
Track ED-324 in parallel. Panellists at the EUROCAE 2026 Symposium agreed that ED-324 accommodates the adaptive and data-driven nature of AI while maintaining the high levels of safety expected in aviation [7]. Draft 7 is currently in open consultation. The EUROCAE page lists the consultation end date as mid-September, without a pinned calendar date [6]. Check eurocae.net/open-consultation-for-ed-324 directly for any update to that deadline. The first version of the standard will be limited to frozen ML models developed through supervised learning, which defines its immediate scope for your certification basis [7]. A future Issue 2 will follow, covering more ML techniques such as reinforcement learning [7]. Align to ED-324 now rather than retrofitting later.
Do not mistake process compliance for genuine safety assurance. The W-shape is the best structured path currently available for building certifiable confidence in AI-based aerospace systems. It is not a proof of safety. It is a disciplined argument for bounded, managed uncertainty. Founders who internalise that distinction will build better systems, have more credible authority interactions, and avoid the painful surprise of believing they were safe because their paperwork was complete.
The paradigm has genuinely shifted. The incompleteness remains. Both things are true, and holding both simultaneously is what serious aerospace engineering looks like.
Sources
[1] easa.europa.eu
[2] easa.europa.eu
[3] easa.europa.eu
[4] mdpi.com
[5] arxiv.org
[6] eurocae.net
[8] arxiv.org
[9] arxiv.org
[10] easa.europa.eu
[11] easa.europa.eu
[12] easa.europa.eu
[13] easa.europa.eu