pio control registers are notionally shared between state machines as
well. state machine operations that change these registers must use
atomic accesses (or critical sections, which would be overkill).
notably PioPin::set_input_sync_bypass was even wrong, enabling the
bypass on a pin requires the corresponding bit to be set (not cleared).
the PioCommon function got it right.
fixing the dma word size to 32 makes it impossible to implement any
peripheral that takes its data in smaller chunks, eg uart, spi, i2c,
ws2812, the list goes on.
compiler barriers were also not set correctly; we need a SeqCst barrier
before starting a transfer as well to avoid reordering of accesses into
a buffer after dma has started.
1414: rp: report errors from buffered and dma uart receives r=Dirbaio a=pennae
neither of these reported errors so far, which is not ideal. add error reporting to both of them that matches the blocking error reporting as closely as is feasible, even allowing partial receives from buffered uarts before errors are reported where they would have been by the blocking code. dma transfers don't do this, if an errors applies to any byte in a transfer the entire transfer is nuked (though we probably could report how many bytes have been transferred).
Co-authored-by: pennae <github@quasiparticle.net>
this reports errors at the same location the blocking uart would, which
works out to being mostly exact (except in the case of overruns, where
one extra character is dropped). this is actually easier than going
nuclear in the case of errors and nuking both the buffer contents and
the rx fifo, both of which are things we'd have to do in addition to
what's added here, and neither are needed for correctness.
sending break conditions is necessary to implement some protocols, and
the hardware supports this natively. we do have to make sure that we
don't assert a break condition while the uart is busy though, otherwise
the break may be inserted before the last character in the tx fifo.
instruction memory is a shared resource. writing it only from PioCommon
clarifies this, and perhaps makes it more obvious that multiple state
machines can share the same instructions.
this also allows *freeing* of instruction memory to reprogram the
system, although this interface is not entirely safe yet. it's safe in
the sense rusts understands things, but state machines may misbehave if
their instruction memory is freed and rewritten while they are running.
fixing this is out of scope for now since it requires some larger
changes to how state machines are handled. the interface provided
currently is already unsafe in that it lets people execute instruction
memory that has never been written, so this isn't much of a drawback for now.
pin and irq operations affect the entire pio block. with pins this is
not very problematic since pins themselves are resources, but irqs are
not treated like that and can thus interfere across state machines. the
ability to wait for an irq on a state machine is kept to make
synchronization with user code easier, and since we can't inspect loaded
programs at build time we wouldn't gain much from disallowing waits from
state machines anyway.
this mainly removes the need for explicit indexing to get the pac
object. runtime effect is zero, but arguably things are a bit easier to
read with less indexing.
this is already done during platform init. it wasn't even sound in the
original implementation because futures would meddle with the nvic in
critical sections, while another (interrupt) executor could meddle with
the nvic without critical sections here. it is only accidentally sound
now and only if irq1 of both pios isn't used by user code. luckily the
worst we can expect to happen is interrupt priorities being set wrong,
but wrong is wrong is wrong.
since we never actually *disable* these interrupts for any length of
time we can simply enable them globally. we also initialize all pio
interrupt flags to not cause system interrupts since state machine
irqa are not necessarily meant to cause a system interrupt when set. the
fifo interrupts are sticky and can likewise only be cleared inside the
handler by disabling them.
dma does this too, also with 12 bits to check. this decreases code size
significantly (increasing speed when the cache is cold), frees up an
interrupt handler, and avoids read-modify-write cycles (which makes each
processed flag cheaper). due to more iterations per handler invocation
the actual runtime of the handler body remains roughly the
same (slightly faster at O2, slightly slower at Oz).
notably wakers are now kept in one large array indexed by the irq
register bit number instead of three different arrays, this allows for
machine code-level optimizations of waker lookups.
1403: Bump versions preparing for -macros and -executor release r=lulf a=lulf
I'd like to propose a new release of embassy-macros and embassy-executor, as there is a challenge with some of the features changing since 0.1.1 when using libraries that depend on 0.1.1 with applications that patch to use git versions.
Co-authored-by: Ulf Lilleengen <lulf@redhat.com>
1406: rp: DMA behaviour during flash operations r=Dirbaio a=kalkyl
This PR changes the old behaviour during flash operations where all DMA transfers were paused during the flash operation.
The new approach is to wait for any DMA operating in flash region to finish and let RAM transfers continue.
Co-authored-by: kalkyl <henrik.alser@me.com>
storing a full function pointer initialized to a resolver trampoline
lets us avoid the runtime cost of checking whether we need to do the
initialization.
rp-hal has done this very well already, so we'll just copy their entire
impl again. only div.rs needed some massaging because our sio access
works a little differently, everything else worked as is.
1378: Add ability to invert UART pins, take 2 r=Dirbaio a=jakewins
Same PR as before, except this now works :)
There was a minor hiccup in the UartRx code where the rx pin got passed as the tx argument, so the invert settings didn't get applied. With this fix, my local setup at least is happily reading inverted uart data.
Co-authored-by: Jacob Davis-Hansson <jake@davis-hansson.com>
1372: rp: add division intrinsics r=Dirbaio a=pennae
rp2040-hal adds division intrinsics using the hardware divider unit in the SIO, as does the pico-sdk itself. using the hardware is faster than the compiler_rt implementations, and more compact too.
since embassy does not expose the hardware divider in any way (yet?) we could go even further an remove the state-saving code rp2040-hal needs, but that doesn't seem to be worth it.
Co-authored-by: pennae <github@quasiparticle.net>
rp2040-hal adds division intrinsics using the hardware divider unit in
the SIO, as does the pico-sdk itself. using the hardware is faster than
the compiler_rt implementations, and more compact too.
This is useful in some cases where the surrounding circuit
for some reason inverts the UART signal, for instance if you're talking
to a device via an optocoupler.
A while ago `OutputOpenDrain` was made to implement `InputPin`,
something that allowed drivers for various one-wire protocols to be
written, but it's been lacking a `Wait` implementation — something
that's needed to write async versions of these drivers.
This commit also adds `get_level()` to `OutputOpenDrain`, since
`is_high()` and `is_low()` were already implemented, but `get_level()`
itself was missing.
1318: rp: Allow zero len reads for buffered uart r=Dirbaio a=timokroeger
Prevents the read methods from getting stuck forever.
cc `@MathiasKoch` can you test if this fixes the problem you described in the chat?
Co-authored-by: Timo Kröger <timokroeger93@gmail.com>