1404: feat(stm32): Add DMA based, ring-buffer based rx uart, v3 r=Dirbaio a=rmja
This PR replaces #1150. Comparing to that PR, this one has the following changes:
* The implementation now aligns with the new stm32 dma module, thanks `@Dirbaio!`
* Calls to `read()` now returns on either 1) idle line, or 2) ring buffer is at most half full. This is different from the previous pr, which would return a lot of 1 byte reads. Thank you `@chemicstry` for making me realize that it was actually not what I wanted. This is accomplished using half-transfer completed and full-transfer completed interrupts. Both seems to be supported on both dma and bdma.
The implementation still have the issue mentioned here: https://github.com/embassy-rs/embassy/pull/1150#discussion_r1094627035
Regarding the todos here: https://github.com/embassy-rs/embassy/pull/1150#issuecomment-1513905925. I have removed the exposure of ndtr from `dma::RingBuffer` to the uart so that the uart now simply calls `ringbuf::reload_position()` to align the position within the ring buffer to that of the actual running dma controller. BDMA and GPDMA is not implemented. I do not have any chips with those dma controllers, so maybe someone else should to this so that it can be tested.
The `saturate_serial` test utility inside `tests/utils` has an `--idles` switch which can be used to saturate the uart from a pc, but with random idles.
Because embassy-stm32 now can have tests, we should probably run them in ci. I do this locally to test the DmaRingBuffer: `cargo test --no-default-features --features stm32f429ig`.
cc `@chemicstry` `@Dirbaio`
Co-authored-by: Rasmus Melchior Jacobsen <rmja@laesoe.org>
Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
1376: rtc: cleanup and consolidate r=Dirbaio a=xoviat
This removes an extra file that I left in, adds an example, and consolidates the files into one 'v2' file.
Co-authored-by: xoviat <xoviat@users.noreply.github.com>
1414: rp: report errors from buffered and dma uart receives r=Dirbaio a=pennae
neither of these reported errors so far, which is not ideal. add error reporting to both of them that matches the blocking error reporting as closely as is feasible, even allowing partial receives from buffered uarts before errors are reported where they would have been by the blocking code. dma transfers don't do this, if an errors applies to any byte in a transfer the entire transfer is nuked (though we probably could report how many bytes have been transferred).
Co-authored-by: pennae <github@quasiparticle.net>
this reports errors at the same location the blocking uart would, which
works out to being mostly exact (except in the case of overruns, where
one extra character is dropped). this is actually easier than going
nuclear in the case of errors and nuking both the buffer contents and
the rx fifo, both of which are things we'd have to do in addition to
what's added here, and neither are needed for correctness.
sending break conditions is necessary to implement some protocols, and
the hardware supports this natively. we do have to make sure that we
don't assert a break condition while the uart is busy though, otherwise
the break may be inserted before the last character in the tx fifo.
1395: rp/pio: bit of a rework r=Dirbaio a=pennae
the pio module is currently in a Bit of a State. this is far from all that's needed to make it more useful, but it's a start.
Co-authored-by: pennae <github@quasiparticle.net>
the rp uart receive fifo is 32 entries deep, so the 31 byte test data
fits into it without needing any buffering. extend to 48 bytes to fill
the entire fifo and the 16 byte test buffer.
instruction memory is a shared resource. writing it only from PioCommon
clarifies this, and perhaps makes it more obvious that multiple state
machines can share the same instructions.
this also allows *freeing* of instruction memory to reprogram the
system, although this interface is not entirely safe yet. it's safe in
the sense rusts understands things, but state machines may misbehave if
their instruction memory is freed and rewritten while they are running.
fixing this is out of scope for now since it requires some larger
changes to how state machines are handled. the interface provided
currently is already unsafe in that it lets people execute instruction
memory that has never been written, so this isn't much of a drawback for now.
pin and irq operations affect the entire pio block. with pins this is
not very problematic since pins themselves are resources, but irqs are
not treated like that and can thus interfere across state machines. the
ability to wait for an irq on a state machine is kept to make
synchronization with user code easier, and since we can't inspect loaded
programs at build time we wouldn't gain much from disallowing waits from
state machines anyway.
this mainly removes the need for explicit indexing to get the pac
object. runtime effect is zero, but arguably things are a bit easier to
read with less indexing.
this is already done during platform init. it wasn't even sound in the
original implementation because futures would meddle with the nvic in
critical sections, while another (interrupt) executor could meddle with
the nvic without critical sections here. it is only accidentally sound
now and only if irq1 of both pios isn't used by user code. luckily the
worst we can expect to happen is interrupt priorities being set wrong,
but wrong is wrong is wrong.
since we never actually *disable* these interrupts for any length of
time we can simply enable them globally. we also initialize all pio
interrupt flags to not cause system interrupts since state machine
irqa are not necessarily meant to cause a system interrupt when set. the
fifo interrupts are sticky and can likewise only be cleared inside the
handler by disabling them.
dma does this too, also with 12 bits to check. this decreases code size
significantly (increasing speed when the cache is cold), frees up an
interrupt handler, and avoids read-modify-write cycles (which makes each
processed flag cheaper). due to more iterations per handler invocation
the actual runtime of the handler body remains roughly the
same (slightly faster at O2, slightly slower at Oz).
notably wakers are now kept in one large array indexed by the irq
register bit number instead of three different arrays, this allows for
machine code-level optimizations of waker lookups.