inline assembly is supported since rust 1.59, we're way past that.
enabling this makes the compiled code more compact, and on rp2040
even decreses memory usage by not needing thunks in sram.
This introduces a `Pender` struct with enum cases for thread-mode, interrupt-mode and
custom callback executors. This avoids calls through function pointers when using only
the thread or interrupt executors. Faster, and friendlier to `cargo-call-stack`.
`embassy-executor` now has `arch-xxx` Cargo features to select the arch and to enable
the builtin executors (thread and interrupt).