FIFO and IPC [solved]
Short version: Is there a way I can still access 4 bytes of shared memory between ARM7 and ARM9 with the FIFOfied libnds?
Long version: I'm using a heavily modified ARM7 core for my new version of Colors!. Basically, it reads input from the touch screen at maximum speed (>500Hz). It uses this massive amount of data to automatically build up a statistics table used to compensate for uneven pressure across the screen. It also resamples this data down to around 60Hz, creating nice stylus input free of stylus jumps, at a rate that increases if the user moves very fast with elaborate movement and decreases if not. I'm using a FIFO user-call to send this over to ARM9.
First off, as the ARM9 can sometimes be busy and unable to receive this input for a bunch of frames, the FIFO buffer can overflow, which causes the ARM7 to lock up or crash. Is this a bug or to be expected? I worked around that by sending back ACKs from the ARM9, so that the ARM7 knows when data has been received and won't try to push data if there's already too much in the queue. There is probably a better solution, right? Can I check on ARM7 how much is in the buffer?
Secondly, while I'm using the FIFO for queuing up the input, letting the ARM9 poll and process it as quickly as it can, I sometimes also want the absolutely most recent position for preview purposes, even though there is a bunch of input waiting to be processed. Dumping over the 500Hz input data to have the ARM9 discard 90% of it and just pick the latest one doesn't work (because of the queue-full problem above, and it is also a waste of processing). So I was hoping to be able to use 4 bytes from the IPC/TransferRegion and constantly write the most recent input into those bytes, so the ARM9 can grab it whenever it needs to. With the proper FIFO introduction to libnds (which is great), that doesn't seem possible to do anymore. Or can I expose it somehow?
Finally, in case you are wondering why I'm going to these lengths just to get some input: I'm a huge fan of reducing latency, and there is a noticeable difference between 60Hz input, which gives 17ms latency (last version of Colors!), and a latency of close to 0 (new version of Colors!). (Obviously, this doesn't take into account how the screen's scan-lining process adds up to 17ms of latency as well, but since my vsync is off, the best case is still that the latency can be zero, even if it's just for an instant.) Minimum latency has been a holy grail of mine. Getting close to 0 latency is not possible on PC, iPhone or any other platform that I have worked on, but it _is_ possible for homebrew DS. I've seen it - felt it - working, and it's wonderful.
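(Editor's sketch) The ACK workaround described above amounts to a small send window: the sender counts messages that have not been acknowledged yet and skips a send when that count hits a limit. The version below is only an illustration in plain C with no libnds calls; `SendWindow`, `MAX_IN_FLIGHT`, `window_try_acquire` and `window_on_ack` are hypothetical names, and the limit of 8 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: how many messages may be unacknowledged at once. */
#define MAX_IN_FLIGHT 8u

typedef struct {
    uint32_t in_flight; /* sent but not yet acknowledged */
    uint32_t dropped;   /* sends skipped because the window was full */
} SendWindow;

/* Returns true if the caller may push one more message right now. */
static bool window_try_acquire(SendWindow *w) {
    if (w->in_flight >= MAX_IN_FLIGHT) {
        w->dropped++;
        return false;
    }
    w->in_flight++;
    return true;
}

/* Call when the receiver acknowledges `count` messages. */
static void window_on_ack(SendWindow *w, uint32_t count) {
    assert(count <= w->in_flight); /* an ACK for unsent data is a logic bug */
    w->in_flight -= count;
}
```

On real hardware the ACK would presumably arrive in an interrupt handler, so the counter would need to be `volatile` and updated with interrupts masked; that part is left out here.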
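(Editor's sketch) One reason 4 bytes are enough for a "most recent position" slot is that a whole touch sample can be packed into one 32-bit word, and an aligned 32-bit store is normally a single write, so the reader should not see a half-updated sample. The bit layout below is an assumption made for illustration (the DS screen is 256x192, so 8 bits per pixel coordinate suffice); `TouchSample`, `touch_pack` and `touch_unpack` are hypothetical names, not part of libnds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits 0-7 x, 8-15 y, 16-30 pressure, bit 31 pen down. */
typedef struct {
    uint8_t  x;
    uint8_t  y;
    uint16_t pressure; /* only the low 15 bits are kept */
    uint8_t  pen_down; /* 0 or 1 */
} TouchSample;

/* Pack one sample into a single 32-bit word. */
static uint32_t touch_pack(TouchSample s) {
    return (uint32_t)s.x
         | ((uint32_t)s.y << 8)
         | ((uint32_t)(s.pressure & 0x7FFFu) << 16)
         | ((uint32_t)(s.pen_down ? 1u : 0u) << 31);
}

/* Recover the fields from a packed word. */
static TouchSample touch_unpack(uint32_t word) {
    TouchSample s;
    s.x = (uint8_t)(word & 0xFFu);
    s.y = (uint8_t)((word >> 8) & 0xFFu);
    s.pressure = (uint16_t)((word >> 16) & 0x7FFFu);
    s.pen_down = (uint8_t)(word >> 31);
    return s;
}
```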
Last edited by Jens on Sun Jun 13, 2010 9:08 pm, edited 1 time in total.
Re: FIFO and IPC
dswifi uses a shared memory region, so you can look at it for reference.
the arm9 allocates memory, maps it to the uncached region, then passes the address to the arm7 on startup.
arm9
Code: Select all
fifoSendAddress(FIFO_USER_01, (void *)memUncached(mybuffer));
//optionally wait for a response
while(!fifoCheckValue32(FIFO_USER_01));
int response = fifoGetValue32(FIFO_USER_01);
arm7
Code: Select all
void myAddressHandler( void * address, void * userdata ) {
    //address is the value from the arm9
    //optionally send a response
    fifoSendValue32(FIFO_USER_01, (u32)0); //I used 0 but it could be something more meaningful
}
//install the handler in main
fifoSetAddressHandler(FIFO_USER_01, myAddressHandler, 0);
the response code that I included assumes that you do not have a handler loaded on the arm9 for FIFO_USER_01. if you do install a handler on the arm9 for FIFO_USER_01 then it will not work.
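(Editor's sketch) What the arm7 might do with the address once it arrives: keep the pointer, then overwrite the shared word with the newest sample whenever one is available, while the arm9 reads the same word through its uncached pointer. The code below is a plain C illustration with no libnds calls so it can be tried on a PC; `g_latest_word`, `on_shared_address`, `publish_latest` and `read_latest` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set once the shared address has arrived; NULL until then. */
static volatile uint32_t *g_latest_word = NULL;

/* Same shape as the address handler in the post above. */
static void on_shared_address(void *address, void *userdata) {
    (void)userdata;
    g_latest_word = (volatile uint32_t *)address;
}

/* Writer side: overwrite the word with the newest sample.
   Returns 0 if the address has not been received yet. */
static int publish_latest(uint32_t sample) {
    if (g_latest_word == NULL) {
        return 0;
    }
    *g_latest_word = sample;
    return 1;
}

/* Reader side: `shared` is the same pointer that was sent to the writer. */
static uint32_t read_latest(const volatile uint32_t *shared) {
    return *shared;
}
```

The `volatile` qualifier is there so the compiler re-reads and re-writes the word every time; it does not replace mapping the buffer to the uncached region as described in the post.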
Re: FIFO and IPC
I'm not completely sure how that works, but it does. The main problem is solved. Thanks!
I'm still interested in how to send large amounts of data without needing to send ACKs back to avoid the FIFO overflow crash.
Re: FIFO and IPC
Jens wrote:I'm still interested in how to send large amounts of data without needing to send ACKs back to avoid the FIFO overflow crash.
I do not think the fifo system in libnds can handle such large messages.
Code: Select all
// FIFO_MAX_DATA_WORDS - maximum number of bytes that can be sent in a fifo message
#define FIFO_MAX_DATA_BYTES 128
would the latency really be that bad from a shared circular buffer? maybe even have the arm7 update the buffer at a lower frequency than it collects data?
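(Editor's sketch) A shared circular buffer of the kind suggested here could look roughly like the single-producer, single-consumer ring below: only the producer writes `head`, only the consumer writes `tail`, and a full ring drops the new sample instead of blocking. This is plain C for illustration, not libnds code; `SampleRing`, `RING_CAPACITY`, `ring_push` and `ring_pop` are hypothetical names, and the capacity of 64 is an arbitrary assumption. It also assumes the structure lives in uncached memory shared by both CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_CAPACITY 64u /* hypothetical size; must be a power of two */

typedef struct {
    volatile uint32_t head;               /* written only by the producer */
    volatile uint32_t tail;               /* written only by the consumer */
    volatile uint32_t data[RING_CAPACITY];
} SampleRing;

/* Producer: returns false (and drops the sample) when the ring is full. */
static bool ring_push(SampleRing *r, uint32_t sample) {
    uint32_t head = r->head;
    if (head - r->tail == RING_CAPACITY) {
        return false;
    }
    r->data[head & (RING_CAPACITY - 1u)] = sample;
    r->head = head + 1u; /* advance only after the slot is written */
    return true;
}

/* Consumer: returns false when the ring is empty. */
static bool ring_pop(SampleRing *r, uint32_t *out) {
    uint32_t tail = r->tail;
    if (tail == r->head) {
        return false;
    }
    *out = r->data[tail & (RING_CAPACITY - 1u)];
    r->tail = tail + 1u;
    return true;
}
```

Because the indices are free-running counters, a full ring and an empty ring are easy to tell apart without wasting a slot. On platforms with reordering or write buffering, memory barriers would probably be needed in addition to `volatile`.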
Re: FIFO and IPC
Couldn't you do the filtering on the arm7 side and send only the final filtered result plus the most recent unfiltered?
Re: FIFO and IPC
elhobbs wrote:would the latency really be that bad from a shared circular buffer? maybe even have the arm7 update the buffer at a lower frequency than it collects data?
No, I don't see why a shared buffer could reduce latency. I was assuming that FIFO worked with a shared buffer under the hood. I've never bothered to check though. Either way, the latency problem was solved with the shared buffer.
Wintermute wrote:Couldn't you do the filtering on the arm7 side and send only the final filtered result plus the most recent unfiltered?
Yes, that's what I tried. But unfiltered in this case means 500Hz, which is too much. I could easily just discard data down to, for example, 60Hz, but you might have guessed that I'm not a fan of that. The shared 4 bytes work great though.
Using FIFO for the filtered 20-100Hz input did require ACKs, since some operations on ARM9 can take a second or more, and if the user spams the screen at that point the buffer will overflow. Writing this now, I realize that my problem might just be that I poll the FIFO on ARM9, rather than having a callback. With a callback, I could be more or less certain that there won't be an overflow, right?
Thanks for the awesome help, by the way.
Re: FIFO and IPC
Yes, using a callback will help a lot with latency, just remember that the callbacks operate in an interrupt context. I did wonder what you meant by arm9 being busy for a bunch of frames.
Re: FIFO and IPC
Just to follow up on this: not using callbacks was my real problem. By "ARM9 being busy" I meant that some operations that I'm doing can take up to a second, and if the user was spamming the touch screen during that time, which added commands to the FIFO, the ARM7 crashed because the queue became full. Now that I'm using callbacks, I have pushed through all that I can without ever running into any problems. I'm also using the shared memory, which was a great complement to the FIFO stuff for me.
Thanks for all the help. Now I only need to free up some more memory, as I discovered today that some flashcards seem to have less memory available than others, and I really didn't have many bytes over...
Re: FIFO and IPC
Jens wrote:Now I only need to free up some more memory, as I discovered today that some flashcards seem to have less memory available than others, and I really didn't have many bytes over...
I have noticed this as well. It looked to me like some cards just grab a chunk of memory at the end of main ram to use as a scratch space. I never really found a good way to deal with it.