Unsurprisingly, I fell into the void of work while developing software for the TechEdSat spacecraft, which explains the lack of posts on my blog. One of the things that sets spacecraft software apart from regular terrestrial software is that you typically don’t get a second chance to fix things in orbit. Sure, NASA APRs require you to have this capability, but most small spacecraft such as TechEdSat get a waiver for the requirement, as it is usually out of the scope of a CubeSat mission.
TechEdSat was deployed from the ISS with a plethora of software problems, mostly stemming from the incomplete state of the project as we approached our delivery deadline. I discovered an interesting one today, which prompted me to write this entry on how it was found and diagnosed.
I was decoding a packet received from JA1GDE, a ham radio operator in Tokyo, Japan, when I discovered a small discrepancy in the spacecraft’s timekeeping. There is only one timer on TechEdSat, the Spacecraft Elapsed Timer (SCET); however, time is stored in two locations in the beacon packet, in two different formats. The SCET itself is stored as elapsed seconds in the first 8 hex ASCII characters (representing 4 bytes) after the 10-character website header (ncasst.org). The second is “NON_Minutes”, the time in minutes the spacecraft has spent in the Nominal mode of operations, stored as 4 hex ASCII characters (2 bytes). Due to last-minute changes in the spacecraft, the processor should never enter “safe mode”, which is the event that would reset the “NON_Minutes” counter, so it’s essentially an elapsed-minutes timer. Intuitively, one would expect the value of NON_Minutes to be exactly SCET/60.
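As a rough sketch of how these hex ASCII fields can be turned into numbers, the C below decodes a fixed-width field and pulls the SCET out of a beacon string. The function names are hypothetical, and it assumes the beacon starts with the ncasst.org header; the position of NON_Minutes within the packet is not covered here, so that field is decoded from an already-extracted string.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER      "ncasst.org"  /* 10-character website header */
#define HEADER_LEN  10u
#define SCET_DIGITS 8u            /* 8 hex ASCII characters = 4 bytes */

/* Decode `ndigits` hex ASCII characters (at most 8) into an unsigned value.
 * Returns 0 on success, -1 if the field is malformed. (Hypothetical helper.) */
int decode_hex_field(const char *field, size_t ndigits, uint32_t *out)
{
    char buf[9]; /* up to 8 hex digits plus the terminator */

    if (field == NULL || out == NULL || ndigits == 0 || ndigits > 8)
        return -1;

    for (size_t i = 0; i < ndigits; i++) {
        /* Rejects the terminator too, so short fields are caught here. */
        if (!isxdigit((unsigned char)field[i]))
            return -1;
        buf[i] = field[i];
    }
    buf[ndigits] = '\0';

    *out = (uint32_t)strtoul(buf, NULL, 16);
    return 0;
}

/* Extract the SCET (seconds) from a beacon string.
 * Assumption: the beacon begins with the website header. */
int decode_scet(const char *beacon, uint32_t *scet_seconds)
{
    if (beacon == NULL || strlen(beacon) < HEADER_LEN + SCET_DIGITS)
        return -1;
    if (strncmp(beacon, HEADER, HEADER_LEN) != 0)
        return -1;
    return decode_hex_field(beacon + HEADER_LEN, SCET_DIGITS, scet_seconds);
}
```

Calling decode_hex_field("b4cf", 4, &minutes) would give the NON_Minutes value as an integer, and decode_scet() does the same for the SCET once the header has been checked.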
Let’s look at the actual numbers:
Field        Hex chars  Hex value  Decimal
SCETTime     8          002a9029   2789417
NON_minutes  4          b4cf       46287
In this example, the SCET shows 2,789,417 seconds and the NON_Minutes timer shows 46,287 minutes. SCET/60 = 46,490 (rounded down), a difference of 203 minutes. I found this interesting and mentioned it to a coworker who sits next to me. We thought about what might cause the error, and started by comparing the size of the number with other metrics in the spacecraft. We stumbled on an interesting relation:
The difference in time, 203 minutes, is 12,180 seconds. This number is within 5% of (SCET/60) divided by 4, that is, a quarter of a second for every minute elapsed. When we realized this closely matched a 250 ms delay every time the NON_Minutes timer ticked up by one, we instantly saw where I had made an error: the spacecraft main loop operates on a 4 Hz cycle, for some arbitrary reason that no one can seem to remember.
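The comparison above can be reproduced with a few lines of C. This is only a sketch of the arithmetic; the function names are made up for illustration and are not part of the flight software.

```c
#include <stdint.h>

/* Minutes by which NON_Minutes lags the minute count implied by the SCET. */
int32_t lag_minutes(uint32_t scet_seconds, uint32_t non_minutes)
{
    /* Integer division drops the partial minute, as in SCET/60 = 46,490. */
    return (int32_t)(scet_seconds / 60u) - (int32_t)non_minutes;
}

/* Seconds lost if a quarter of a second is lost per elapsed minute,
 * i.e. (SCET/60) divided by 4. */
double quarter_second_estimate(uint32_t scet_seconds)
{
    return (scet_seconds / 60u) / 4.0;
}

/* Relative difference between the observed lag and the estimate. */
double relative_error(double observed, double estimate)
{
    return (observed - estimate) / estimate;
}
```

For the packet in question, lag_minutes(2789417, 46287) gives 203 minutes, or 12,180 seconds, while quarter_second_estimate(2789417) gives 11,622.5 seconds. The relative error between the two is a little under 5%.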
delay_ms(250); // Cycle currently operates at 4Hz
The above line of code is the offender. Because the main loop waits 250 ms at the end of every cycle, each update of the NON_Minutes timer is 250 ms late. Compounded over the 32 days of the mission elapsed so far, this delay produces the drift in the NON_Minutes timer.
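To get a feel for how the lateness compounds, here is a small simulation in C. It is a simplified model, not the flight code: it assumes the loop polls every 250 ms and that each new 60-second interval only starts counting a fixed time after the previous tick. The function name and the late_ms parameter are hypothetical.

```c
#include <stdint.h>

#define LOOP_DELAY_MS 250u    /* main loop runs at 4 Hz */
#define MINUTE_MS     60000u

/* Count how many times a minute counter ticks within `scet_seconds`
 * when each new interval starts `late_ms` after the previous tick.
 * Assumed model only; the real executive may differ. */
uint32_t simulate_non_minutes(uint32_t scet_seconds, uint32_t late_ms)
{
    const uint64_t end_ms = (uint64_t)scet_seconds * 1000u;
    uint64_t now_ms = 0;
    uint64_t interval_start_ms = 0;
    uint32_t non_minutes = 0;

    /* Each pass is one loop cycle: check the timer, then delay. */
    while (now_ms + LOOP_DELAY_MS <= end_ms) {
        if (now_ms >= interval_start_ms &&
            now_ms - interval_start_ms >= MINUTE_MS) {
            non_minutes++;
            /* The next minute is measured from a late starting point. */
            interval_start_ms = now_ms + late_ms;
        }
        now_ms += LOOP_DELAY_MS; /* stands in for delay_ms(250) */
    }
    return non_minutes;
}
```

With the packet’s SCET of 2,789,417 seconds, this model gives 46,490 ticks with no lateness and 46,297 ticks with 250 ms of lateness per tick, a lag of 193 minutes. That accounts for most of the 203 minutes observed; the remainder is not explained by this simple model.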
I’m noting this as a “lesson learned” and will not adopt a similar scheme for the spacecraft executive on my next spacecraft.