DMA in one cycle? Must be tiny or have a ton of dual-port memory.

All that time-slice task switching is convenient if you have cycles and stack to spare; then you don't have to think much about timing. On 8-bit controllers there's rarely that luxury. By the time you factor in preemption, blocking, context-switching overhead, and so on, you're usually better off making it interrupt driven. That's sort of like an RTOS, except that interrupts rather than the system clock drive task switching, and priorities, blocking, etc. live in the ISRs and/or an interrupt handler (an ISR for each task, with a handler directing the flow) rather than at the system level. That eliminates a lot of overhead, and there usually aren't so many tasks that it's too hard to manage. Dependencies can be a gotcha though, depending on how you structure your code. If it gets to be too much, usually it's easiest to add another processor - silicon is cheap. There is a trend to turn to 16- and even 32-bit parts just for the extra resources where an 8-bitter will do, but it is inelegant!
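To make the "ISR for each task with a handler directing the flow" idea concrete, here's a rough sketch in plain C that runs on a desktop compiler. Everything in it is made up for illustration: the source names (SRC_UART_RX and friends), isr_raise() and dispatch_one() are hypothetical, and the "ISRs" are ordinary function calls rather than real vector entries. Priority is simply the order of the enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt sources, listed highest priority first. */
enum { SRC_UART_RX, SRC_TIMER, SRC_ADC, NUM_SOURCES };

/* One pending bit per source; volatile because ISRs write it. */
static volatile uint8_t pending;

/* Counters stand in for the real per-task processing. */
static uint8_t handled[NUM_SOURCES];

/* Each "ISR" does the bare minimum: latch the event and return.
   On real hardware these would be vector entries, not plain calls. */
static void isr_raise(uint8_t src)
{
    assert(src < NUM_SOURCES);
    pending |= (uint8_t)(1u << src);
}

/* The handler directs the flow: service the highest-priority pending
   source, clear its bit, and report whether anything was done. */
static bool dispatch_one(void)
{
    for (uint8_t src = 0; src < NUM_SOURCES; src++) {
        uint8_t mask = (uint8_t)(1u << src);
        if (pending & mask) {
            /* On a real part, interrupts would be masked around this
               read-modify-write so an ISR cannot slip in between. */
            pending &= (uint8_t)~mask;
            handled[src]++;
            return true;
        }
    }
    return false; /* nothing pending: a real main loop would sleep here */
}
```

The main loop would just call dispatch_one() until it returns false and then idle. There's no saved context and no per-task stack, which is where the overhead savings come from; the cost is that every handler has to run to completion.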

For tiny stuff, C being "loose" is a virtue. It allows you to get almost as close to the iron as assembler when you need to, without adding inline assembler code, which can cause its own set of problems. You can still do encapsulation, but it's your responsibility rather than the language's. And it's low level enough that processor-specific extensions are sensible. You'll have to take up C++ with Stroustrup; I don't use it much. I've fallen into Python for what little I need on the desktop.
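As a small illustration of encapsulation being your responsibility, here's a sketch of a one-file "LED driver" in C. It's an assumption-laden toy: fake_port is an ordinary variable standing in for a memory-mapped register so the code runs on a desktop, and LED_BIT and the led_* names are invented for the example.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped port register. On real hardware this
   would be a fixed address; a plain variable keeps the sketch runnable. */
static volatile uint8_t fake_port;

#define LED_BIT 3u /* hypothetical pin assignment */

/* "Private" by convention: file-scope static, declared in no header. */
static uint8_t led_toggle_count;

/* The "public" interface: the only functions a header would declare. */
void led_on(void)  { fake_port |= (uint8_t)(1u << LED_BIT); }
void led_off(void) { fake_port &= (uint8_t)~(1u << LED_BIT); }

void led_toggle(void)
{
    fake_port ^= (uint8_t)(1u << LED_BIT);
    led_toggle_count++;
}

uint8_t led_is_on(void)   { return (uint8_t)((fake_port >> LED_BIT) & 1u); }
uint8_t led_toggles(void) { return led_toggle_count; }
```

Nothing in the language stops another file from poking the port directly; the static keyword and a disciplined header are the whole fence. The bit twiddling is about as close to the iron as you get without dropping into assembler.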

I still don't like the old Intel architecture, though compilers hide (most of) the warts.

Time to return to your regularly scheduled program.


The key elements in human thinking are not numbers but labels of fuzzy sets. -- L. Zadeh

Which explains a lot.