slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Rename some symbols related to pointers types (#8592)	Theresa Foley	2025-10-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Note that while this change touched a large numer of files, there are no changes to functionality being made here. The only things being done are renaming various symbols and, in a few cases, updating or adding comments for consistency with the new names. The core of the naming changes are: * Most things named to refer to `OutType` (e.g., `IROutType`, `IRBuilder::getOutType()`, etc.) have been consistently renamed to refer to `OutParamType`, to emphasize that the relevant AST/IR node types are only intended for use to represent `out` parameters. * The same change as described above for `OutType` is also made for `RefType`, which becomes `RefParamType` in most cases. One mess that this exposes is the way that the `ExplicitRef<T>` type in the core module currently lowers to `IRRefParamType`. This change sticks to the rule of not making functional changes, so that mess is left as-is for now. * Names referring to `InOutType` have been changed to instead refer to `BorrowInOutType`. The intention with this naming change is to emphasize that the Slang rules for `inout` are semantically those of a borrow (or at least our interpretation of what a borrow means). * Names referring to `ConstRefType` have been changed to instead refer to `BorrowInType`. This change starts work on clarifying that the existing `__constref` modifier was never intended to be a read-only analogue of `__ref`, and instead is the input-only analogue of `inout`. * The `ParameterDirection` enum type has been changed to `ParamPassingMode`, to reflect the fact that the concept of "direction" fails to capture what is actually being encoded, particularly once we have modes beyond simple `in`/`out`/`inout`. While this change does not alter behavior in any case (the user-exposed Slang language is unchanged), it is intended to set up subsequence changes that will work to make the handling of these types in the compiler more nuanced and correct. Breaking this part of the change out separately is primarily motivated by a desire to minimize the effort for reviewers. --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
*	Fixup address spaces after inlining. (#7731)	Yong He	2025-07-11
\| \| \| \| \|	* Fixup address spaces after inlining. * add -O0
*	extend fiddle to allow custom lua splices in more places (#7559)	Ellie Hermaszewska	2025-07-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add fkYAML submodule * Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml * generate ir-inst-defs.h * neaten things * neaten inst def parser * add rapidyaml submodule * remove fkyaml * remove fkyaml submodule * remove use of ir-inst-defs.h * format and warnings * fix wasm build * tidy * remove rapidyaml * Extend fiddle to allow custom splices in more places * Use lua to describe ir insts * fix * neaten * neaten * neaten * spelling * neaten * comment comment out assert * merge
*	Add debug information for slang inling (#6621)	Mukund Keshava	2025-05-10
\|
*	Move switch statement bodies to their own lines (#5493)	Ellie Hermaszewska	2024-11-05
\| \| \| \| \| \| \| \| \|	* Move switch statement bodies to their own lines * format --------- Co-authored-by: Yong He <yonghe@outlook.com>
*	format	Ellie Hermaszewska	2024-10-29
\| \| \| \| \| \| \|	* format * Minor test fixes * enable checking cpp format in ci
*	Replace the word stdlib or standard-library with core-module for source code ↵	Jay Kwak	2024-10-28
\| \| \| \| \|	(#5415) This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
*	Avoid inlining functions with inline ASM blocks. (#4925)	Sai Praveen Bangaru	2024-08-27
\|
*	Tuple swizzling, concat, comparison and `countof`. (#4856)	Yong He	2024-08-19
\| \| \| \| \| \| \| \| \| \| \|	* Tuple swizzling and element access. * Update proposal status. * Cleanup. * Fix merrge error. * Address review.
*	Add options to speedup compilation. (#4240)	Yong He	2024-05-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
*	Add `-minimum-slang-optimization` to favor compile time. (#4186)	Yong He	2024-05-17
\|
*	WIP: Force Inline If RefType (#4005)	ArielG-NV	2024-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Force Inline if reftype Fixes #3997. If we are using a refType, we now ForceInline. remarks: 1. Modifications were made in slang-ir-glsl-legalize to change how we translate GlobalParam proxy's into GlobalParam. a. We now handle the senario where a globalParam is used in multiple disjoint blocks (like 2 different functions). * try to figure out why CI fails but local works try to inline DispatchMesh, works locally, may fail on CI(?) * try another fix * add task tests + don't allow semi-early task-shader inline Task shader uses DispatchMesh which is a very big 'hack' where we check for the function name and modify the callees in very large ways. This function does inline, but it cannot inline early due to future mangling that this operation requires todo. This is reflected with the `[noRefInline]` modifier. It is a modifier so users may stop mandatory inlines with `__ref` parameter.
*	SPIRV Legalization fixes. (#3479)	Yong He	2024-01-23
\| \| \| \| \| \| \| \| \| \| \|	* Fix CFG legalization for SPIRV backend. * Emit DepthReplacing execution mode. * Fix do-while lowering. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Add ConstBufferPointer::subscript. (#3415)	Yong He	2023-12-15
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Run curated spirv-opt passes through slang-glslang. (#3266)	Yong He	2023-10-09
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Run curated spirv-opt passes through slang-glslang. * Cleanup. * Replace spirv-dis downstream compiler with glslang. * delete slang-spirv-opt.cpp. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	SPIRV compiler performance fixes. (#3258)	Yong He	2023-10-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* SPIRV compiler performance fixes. * Cleanup. * update project files * Cleanup debug code. * Make redundancy removal non-recursive. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Revert inlining change in #3217. (#3229)	Yong He	2023-09-21
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Move force inlining step to before `processAutodiffCalls` (and run in loop) ↵	Sai Praveen Bangaru	2023-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \|	(#3217) * Move auto-diff force inlining step to before `processAutodiffCalls` * Fix `replaceUsesWith` to handle existing inst defined after current use. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Don't inline callees with custom derivative before autodiff. (#3196)	Yong He	2023-09-08
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Fix GLSL code gen around RayQuery and HitObject types. (#3173)	Yong He	2023-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Update slang-llvm. * Fix. * fix. * Fix unit tests for multi-thread execution. * Fix tests. * fixes. * update tests. * Add gfx-smoke to linux expected failure list. * Try fix test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Inline all RayQuery/HitObject typed functions when targeting GLSL. (#3172)	Yong He	2023-08-31
\| \| \| \| \| \| \| \| \|	* Inline all RayQuery/HitObject typed functions when targeting GLSL. * update test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Add SPIRV atomics intrinsics and fix buffer layout lowering. (#3170)	Yong He	2023-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix atomics intrinsics and buffer layout lowering. * Fix. * Add more test. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Fix memory barrier intrinsics. (#3166)	Yong He	2023-08-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix memory barrier intrinsics. Makes them produce the same spirv code as dxc. * Fix. * filecheck barrier test for spirv backend. * Fix glsl intrinsic definition. * Fix intrinsics. * Fix intrinsics. * Fix. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Add more wave intrinsics. (#3162)	Yong He	2023-08-29
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Add `target_switch` and `intrinsic_asm` statement. (#3154)	Yong He	2023-08-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add `target_switch` and `__intrinsic_asm` statement. * Cleanup. * WaveGetActiveMask, WaveGetActiveMask, WaveCountBits. * WaveIsFirstLane. * More wave intrinsics. * wave intrinsics. * merge fix. * Fix. * Fix. * Update test. * update test. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Misc. SPIRV Fixes, Part 2. (#3147)	Yong He	2023-08-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Misc. SPIRV Fixes, Part 2. * Fix up. * Fix. * Add system smenatic values. * 16 bit int and floats, matrix/vector reshape, bool ops. * Fix. * Fix. * Allow push constant entry point params. * entrypoint params. * swizzleSet and swizzledStore. * packoffset. * string hash. * Fix. * Matrix arithmetics. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Use ankerl/unordered_dense as a hashmap implementation (#3036)	Ellie Hermaszewska	2023-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Correct namespace for getClockFrequency * missing const * Add missing assignment operator * Remove unused variables * Return correct modified variable * Use stable hash code for file system identity * terse static_assert * Structured binding for map iteration * Make (==) and getHashCode const on many structs * Add ConstIterator for LinkedList * Replace uses of ItemProxy::getValue with Dictionary::at * Extract list of loads from gradientsMap before updating it * Const correctness in type layout * Add unordered_dense hashmap submodule * Use wyhash or getHashCode in slang-hash.h * refactor slang-hash.h * Use ankerl/unordered_dense as a hashmap implementation Notable changes: - The subscript operator returns a reference directly to the value, rather than a lazy ItemProxy (pair of dict pointer and key) slang-profile time (95% over 10 runs): - Before: 6.3913906 (±0.0746) - After: 5.9276123 (±0.0964) * 64 bit hash for strings So they have the same hash as char buffers with the same contents * Narrowing warnings for gcc to match msvc * revert back to c++17 * Correct c++ version for msvc * Use path to unordered_dense which keeps tests happy * Do not assign to and read from map in same expression * Remove redundant map operations in primal-hoist * Split out stable hash functions into slang-stable-hash.h * 64 bit hash by default * regenerate vs projects * Correct return type from HashSetBase::getCount() * correct width for call to Dictionary::reserve * Use stable hash for obfuscated module ids * Signed int for reserve * clearer variable naming * Parameterize Dictionary on hash and equality functors * Allow heterogenous lookup for Dictionary * missing const * Use set over operator[] in some places * Remove unused function * s/at/getValue
*	Add perf benchmark utility. (#2977)	Yong He	2023-07-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Add perf benchmark utility. * Update documentation. * Fix. * Fix doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Dictionary using lowerCamel (#2835)	jsmall-nvidia	2023-04-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
*	Fix inlining. (#2786)	Yong He	2023-04-10
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Cleaner impl of unary stdlib derivative functions. (#2785)	Yong He	2023-04-10
\| \| \| \| \| \| \| \| \| \| \|	* Cleaner impl of unary stdlib derivative functions. * fixup * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Remove `SharedIRBuilder`. (#2657)	Yong He	2023-02-16
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Overhaul global inst deduplication and cpp/cuda backend. (#2654)	Yong He	2023-02-16
\| \| \| \| \| \| \| \| \|	* Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
*	Fixes for crash when inlining at global scope (#2593)	Theresa Foley	2023-01-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fixes for crash when inlining at global scope Recent changes to the way inlining is implemented in the Slang compiler have broken certain scenarios involving `static const` declarations. The basic problem is that the initial-value expression for a `static const` gets lowered into IR code at the global scope of a module, and if that code includes `call`s to stdlib operations marked `forceInlineEarly`, then we end up trying to apply inlining to code at module scope. The current inlining operation assumes that all `call`s are in basic blocks, and that the correct way to do inlining involves splitting those blocks. This change adds logic to detect when the callee at a call site to be inlined consists of a single basic block ending in a `return`, and in that case it invokes specialized inlining logic that doesn't split basic blocks and doesn't need to care if the original `call` is in a basic block. Thus we are able to inline calls to single-basic-block `forceInlineEarly` functions called as part of the initialization for global-scope `static const` variables. This logic does not solve the problem of calls to multi-block `forceInlineEarly` functions from the global scope. Such calls cannot really be inlined. A secondary problem that arises when inlining such calls is that the callee might include local temporaries (`var` instructions) that are read and written (`load`s and `store`s), and none of those instructions should be allowed at the global scope. In the case of the functions being inlined here, the `load`/`store` operations are superfluous, and should be cleaned up by our SSA pass. The only reason that they seem to not be getting cleaned up in the case that was been triggering crashes is that the callee is a generic. The current logic for the SSA pass was skipping the bodies of generic functions, so they would not be cleaned up. This change enables the SSA pass to apply to the bodies of generic functions, and also ensures that SSA cleanups are applied before any `forceInlineEarly` functions get inlined. * fixup: liveness test outputs
*	Remove `construct` IR op. (#2555)	Yong He	2022-12-07
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Lower-to-ir no longer produce `Construct` inst. (#2553)	Yong He	2022-12-07
\| \| \|	Co-authored-by: Yong He <yhe@nvidia.com>
*	Inline functions with string param/return for GPU targets (#2544)	jsmall-nvidia	2022-12-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* #include an absolute path didn't work - because paths were taken to always be relative. * WIP inlining of functions that take or return string related types on GPU targets. * Small fixes. * Added a test. * Add checking for any getStringHash insts are valid. * Support getStringHash on CUDA. * Tweak diagnostic.
*	Fix inlining pass. (#2506)	Yong He	2022-11-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix inlining pass. * Add more check against corner cases. * Revise comments. * Fixes. * Fix premake script. * Fixes. Co-authored-by: Yong He <yhe@nvidia.com>
*	Allow multi-level breaks to break out of `switch` statements. (#2451)	Yong He	2022-10-13
\| \| \| \| \| \| \| \| \|	* Allow multi-level breaks to break out of `switch` statements. * Rename loop->region. * Add `[ForceInline]` attribute. Co-authored-by: Yong He <yhe@nvidia.com>
*	Support multi-level break + single-return conversion + general inline. (#2436)	Yong He	2022-10-10
\| \| \| \| \| \| \| \| \|	* Support multi-level break. * Single return. * Add test for inlining `void` return-type functions. Co-authored-by: Yong He <yhe@nvidia.com>
*	Clean up void returns. (#2260)	Yong He	2022-06-01
\| \| \| \| \| \| \|	* Clean up `IRReturnVoid`. * Update gitignore. Co-authored-by: Yong He <yhe@nvidia.com>
*	Improved SCCP, inlining and resource specialization passes, legalize ↵	Yong He	2022-02-25
\| \| \| \|	`ImageSubscript` for GLSL (#2146)
*	Cleanup refactoring work around the IR builder (#2061)	Theresa Foley	2021-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Cleanup refactoring work around the IR builder We have some long-term goals for the IR that require a more centralized and disciplined set of rules for how IR instructions get created/emitted. I had been working on trying to set things up so that all IR instruction creation goes through a single bottleneck point, but the non-trivial work in that branch was getting drowned out by the sheer volume of cleanup and refactoring changes. This change tries to pull together several of the more important cleanups. The big pieces are: * `IRBuilder` and `SharedIRBuilder` now protect their data members and rely on users to initialize them more directly via constructor of an `init()` method. This change affects a bunch of sites where `IRBuilder`s were created. I changed use sites to use the constructors whenever possible, and to use `init()` in cases where we had longer-lived builders that needed to be initialized multiple times. * The insertion location for the `IRBuilder` now uses an encapsulated type called `IRInsertLoc`. This new type can replace what used to be just two `IRInst` fields in the builder, and also covers some new functionality (if we ever want to take advantage of it). Very little client code cares about this change, but it is still a nice cleanup in terms of making things more explicit. The creation of an `IRModule` has been moded out of `IRBuilder`, because in practice we `IRBuilder` always wants to be associated with a pre-existing `IRModule` at creation time (via its `SharedIRBuilder`). There is now an `IRModule::create()` operation instead. This required changing the sequencing at many `IRModule` creation sites, since most had been contriving to make an `IRBuilder` first. There were also several cleanups because code had been carelessly using non-reference-counted pointers for `IRModule`s in ways that broke now that `IRModule::create()` always returns a `RefPtr`. * The core operations to actually allocate memory for IR instructions were moved into `IRModule` (since they interact with the memory pool that the module owns). These were called `createEmptyInst()` but have been renamed into `_allocateInst()`. In principle these seem like they should only be needed to be called by the `IRBuilder`, but in practice they are also needed by the IR deserialization logic. * A few core operations for emitting IR instructions that were associted with `IRBuilder` were moved to actually be methods on `IRBuilder`. First is `_findOrEmitConstant` which is the primary bottleneck for creating simple scalar constant values. Another is `_createInst` (formerly part of the templated `createInstImpl` along with `createInstWithSizeImpl`) which is the main bottleneck for allocation and initialization of any instruction other than a constant (well, the `IRModuleInst` is the other exception...). Finally, there is also `_maybeSetSourceLoc()`, which is obvious to scope inside the `IRBuilder` once it is protecting the source-location info. Notes: * The `minSizeInBytes` parameter to `_createInst()` might not actually be needed at all. At this point any `IRInst` subtypes that need data allocated for things other than their operands already get created manually via `_allocateInst` or `_findOrEmitConstant`, so I think we could remove that part. I will handle that in a subsequent cleanup if it turns out to be the case. * There is one IR pass (`slang-ir-string-hash.cpp`) that is using manual `_allocateInst()` instead of going through an `IRBuilder`. It could be easily cleaned up to not do so (and I will probably make that change down the line), but for now I wanted to avoid doing anything that wasn't close to pure refactoring if I could. * At this point in our design an `IRBuilder` is a very lightweight thing - it basically just owns the insertion location plus a source location to write into instructions. A lot of our code currently treats `IRBuilder`s like they are expensive and/or need to be re-used (which leads to them being used in more mutable/stateful ways). It is quite likely that as we clean up other aspects of the implementation of IR creation/emission we can make `IRBuilder` use feel more lightweight in ways that can streamline and simplify code. * The next step for this work is to identify the different paths that eventually lead to `_createInst()` being called, and unify them at a single bottleneck operation that can own the decisions around when to create an instruction vs. when to re-use an existing one (rather than those decisions being baked into the various `IRBuilder` subroutines that create instructions of the various subtypes). * fixup: gcc/clang C++ spec details
*	Add an accessor for IRInst opcode (#1707)	Tim Foley	2021-02-16
\| \| \| \| \| \| \| \| \|	* Add an accessor for IRInst opcode This main changing is renaming `IRInst::op` over to `IRInst::m_op` and then adds an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it). This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR. * fixup
*	Copy SourceLoc when inlining (#1692)	jsmall-nvidia	2021-02-08
\| \| \| \| \|	* #include an absolute path didn't work - because paths were taken to always be relative. * Copy source loc information when inlining.
*	Add a basc inlining facility for use in the stdlib (#1271)	Tim Foley	2020-03-11
	The main feature visible to the stdlib here is the `[__unsafeForceInlineEarly]` attribute, which can be attached to a function definition and forces calls to that function to be inlined immediately after initial IR lowering. The pass is implemented in `slang-ir-inline.{h,cpp}` and currently only handles the completely trivial case of a function with no control flow that ends with a single `return`. The lack of support for any other cases motivates the `__unsafe` prefix on the attribute. In order to test that the pass works, I modified the "comma operator" in the standard library to be defined directly (rather than relying on special-case handling in IR lowering), and then added a test that uses that operator to make sure it generates code as expected. The compute version of the test confirms that we generate semantically correct code for the operator, while the SPIR-V cross-compilation test confirms that our output matches GLSL where the comma operator has been inlined, rather than turned into a subroutine. Notes for the future: * With this change it should be possible (in principle) to redefine all the compound operators in the stdlib to instead be ordinary functions with the new attribute, removing the need for `slang-compound-intrinsics.h`. * Once the compound intrinsics are defined in the stdlib, it should be easier/possible to start making built-in operators like `+` be ordinary functions from the standpoint of the IR * The attribute and pass here could be extended to include an alternative inlining attribute that happens later in compilation (after linking) but otherwise works the same. This could in theory be used for functions where we don't want to inline the definition into generated IR, but still want to inline things berfore generating final HlSL/GLSL/whatever. * The inlining pass itself could be generalized to work for less trivial functions pretty easily; for the most part it would just mean "splitting" the block with the call site and then inserting clones of the blocks in the callee. Any `return` instructions in the clone would become unconditional branches (with arguments) to the block after the call (which would get a parameter to represent the returned value). * The "hard" part for such an inlining pass would be handling cases where the control flow that results from inlining can't be handled by our later restructuring passes. The long-term fix there is to implement something like the "relooper" algorithm to restructure control flow as required for specific targets.