I recently had the pleasure of debugging HyperTerminal on PortICA. It is going to take some time to put together the story of how things turned out but I wanted to start by saying something obvious.
If you are going to use HyperTerminal on Windows XP (since Vista does not have it), be sure to never turn the cursor blink off. HyperTerminal has a built-in bug where the -1 setting (which means the cursor never blinks) makes it think that it needs to constantly redraw the cursor (caret) in its message processing loop.
Working this out was expensive. It turns out that the problem was first reported in 2001 and apparently discovered in Windows 2000. It was reported again in 2005. It does not seem to be a common problem, since it is hard to find on the Internet. It must be one of those kinds of things that people just avoid.
I was naive enough not to know that this was the real problem and was instead focusing on the COM port angle.
The symptoms were pretty drastic. Once started with a session, HyperTerminal would freeze. It didn’t seem to matter how many different combinations were tried.
It is more common for the cursor blink to be turned off for remote sessions since there is a desire to reduce bandwidth for what would normally be considered pointless. The cursor blink traffic can be chatty and turning it off makes things look better for the network.
Historically this was very important for WinFrame since having a server with lots of users and flashing cursors leads to an annoying amount of traffic that would be difficult to support over a modem.
Today it is much less of a concern, but it could still be a problem if a massive number of users are sharing the same cluster of servers. I don’t know of any recent studies on this.
The debug sessions started with using WinDbg into Windows XP with a debug PICASER.SYS. This helped to find a problem but not the problem that I cared about. After fixing this and moving on, I needed to switch to different strategies since HyperTerminal was making no active requests to the Citrix serial driver. All it had was an outstanding read request. This looked like it was working fine.
At some point I decided to debug HyperTerminal directly. This showed that it was looping in the first thread in the message loop. Further work showed that once it received a certain message, it would get stuck. Later on it became more obvious that callbacks were being used related to tick count differences. At this point I asked for help internally and Michael Wookey showed me how to use IDA to look at these kinds of problems. IDA is much better at figuring things out than WinDbg is. It sped up my understanding considerably.
By now, it was proven that it was stuck processing timer callbacks. I started mapping out the data structures being referenced and a pattern emerged. I could describe each structure used and how they were connected. The loop in question was checking values that guaranteed it would keep looping.
One of the key finds was that it had a relative delay of 0xffffffff (-1). This was bad since adding -1 to the current tick count would always come up with a value one less than the current value. This meant the loop would continue to dispatch forever since the event would never be waited for and always executed. Zero was okay since it would purposely delay for 1 millisecond. The core bug is that it is not doing overflow checking. The rollover is ignored and there seems to be some confusion on the special meaning for -1.
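To make the arithmetic concrete, here is a small sketch of the rollover. It is not HyperTerminal's actual code (there were no symbols, remember); the function names `naive_deadline` and `naive_is_due` are made up for illustration, and I am assuming a 32-bit unsigned tick count like the one GetTickCount returns.

```c
#include <stdint.h>

/* Hypothetical reconstruction: compute an absolute deadline from the
 * current tick count and a relative delay. Unsigned addition wraps
 * modulo 2^32, so a delay of 0xFFFFFFFF behaves like "minus one". */
static uint32_t naive_deadline(uint32_t now, uint32_t delay)
{
    return now + delay;
}

/* Hypothetical "is this timer due?" test with no rollover handling:
 * the event fires as soon as the tick count reaches the deadline. */
static int naive_is_due(uint32_t now, uint32_t deadline)
{
    return now >= deadline;
}
```

With a tick count of 1000 and a normal delay of 500, the deadline is 1500 and the timer is not yet due. With the same tick count and a delay of 0xFFFFFFFF, the deadline wraps to 999, which is already in the past, so the timer is due immediately. Re-arm it with the same delay and the same thing happens again on the next pass.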
The final step was to find where this came from. I set debug registers on the setting of the memory location that held 0xffffffff. I tracked this back a few times until the holy grail was found. HyperTerminal does a call to GetCaretBlinkTime which sets this value. GetCaretBlinkTime is fed by SetCaretBlinkTime. SetCaretBlinkTime was being set with -1 by PortICA for the sake of turning off cursor blinking during the session.
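For what it is worth, the kind of guard I would have expected looks something like the sketch below. This is my own illustration under stated assumptions, not anything taken from HyperTerminal or PortICA: `blink_plan`, `plan_blink_timer`, and `BLINK_INFINITE` are hypothetical names, and the value passed in stands for whatever GetCaretBlinkTime returned. The constant mirrors the 0xffffffff value seen in the debugger.

```c
#include <stdint.h>

/* Assumption: 0xFFFFFFFF is the "caret does not blink" value that
 * showed up in memory during the debug session. */
#define BLINK_INFINITE 0xFFFFFFFFu

/* Hypothetical result type: whether to arm a blink timer, and how often. */
typedef struct {
    int      armed;        /* 0 = no blink timer at all */
    uint32_t interval_ms;  /* only meaningful when armed != 0 */
} blink_plan;

/* Decide what to do with a caret blink time before it ever reaches
 * the timer arithmetic. */
static blink_plan plan_blink_timer(uint32_t blink_time_ms)
{
    blink_plan plan = {0, 0};

    if (blink_time_ms == BLINK_INFINITE) {
        return plan;  /* blinking is off: do not schedule anything */
    }

    plan.armed = 1;
    /* Zero is clamped to 1 ms, matching the delay observed for zero. */
    plan.interval_ms = (blink_time_ms == 0) ? 1u : blink_time_ms;
    return plan;
}
```

The point of the design is simply that the special value is handled once, up front, so it can never be added to a tick count.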
It took roughly a week to clear this one up.
Frustration point #1: Microsoft did not provide any symbols for HyperTerminal even though other things in XP are covered.
Frustration point #2: Microsoft never fixed this obvious bug.
Frustration point #3: I wish I had known the answer before I spent that week 🙂
Oh well. Chalk it up to increased debugging experience. I used to love this kind of stuff. Now that I’m older I don’t like diving in quite as fast. It’s draining to understand what only the coders knew for sure. It does feel good that I can pull something like this off still.
The fix will be coming soon to PortICA.
Any opinions on cursor flashing on a remote session? Like it? Hate it? Faster? Slower? Not at all?
Your opinion could actually have an influence on how this turns out.