The story so far
You may remember that things had been going well, swimmingly well... ah well.
And here we go
So on Sunday we rolled out a few settops around the place and all of a sudden we started getting issues. And on Monday the executive private jet turned up, replete with executives, and they started complaining about stalled images on WNBC. Those with good memories will recall that during testing in New Jersey the channel that gave the most trouble was WNBC.
Same as it ever was
Cap deaf
Not a new hip hop act, but instead two words that send dread through anyone who remembers issues with DirecTV where the card would refuse to accept messages. And because it would not accept messages, you couldn't tell it to reboot...
We have the same problem. Not, luckily for us, in the settop, but in the system. The problem being that every once in a while the picture would freeze, only to come back some two and a half minutes later. Doesn't that sound like cap deaf? I said, DOESN'T THAT SOUND LIKE CAP DEAF? HELLO?
But now we come to tracing the issue, which took until 4:00am on, well today really, to track down.
So the issue is that for multicast, the routers all send messages to each other to make sure that someone is actually watching the pictures. Those that know will liken this to Switched Digital Video. The source router sends - and I've been doing the research, hence the 4:00am - an IGMPv2 General Membership Query. We have, as specified in the message, 10 seconds to comply. On occasion we take 9.998 seconds to respond. And when you look at the chaining of the message, there's a good chance the origin router times us out and stops the broadcast. Damn you, damn you to Heck...
Two and a half minutes later another request comes in, and this time we reply in a more timely manner - the spec says we should take a random time up to the maximum indicated - and the stream restarts. Pictures are restored and off we go.
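For the curious, here's roughly what that rule looks like. This is a sketch, not our settop code: the function name is made up for illustration, and the only thing taken from the IGMPv2 spec (RFC 2236) is that the query's Max Response Time field counts in tenths of a second, so 10 seconds arrives as 100.

```python
import random


def report_delay_seconds(max_resp_code: int, rng: random.Random) -> float:
    """Pick how long to wait before answering a membership query.

    max_resp_code is the Max Response Time field from the query,
    in units of 1/10 second (100 means 10 seconds).
    The name and signature are hypothetical, for illustration only.
    """
    max_resp_seconds = max_resp_code / 10.0
    # The host waits a random time somewhere inside the allowed window.
    return rng.uniform(0.0, max_resp_seconds)


if __name__ == "__main__":
    rng = random.Random()
    delays = [report_delay_seconds(100, rng) for _ in range(10_000)]
    print(f"slowest reply out of 10,000: {max(delays):.3f} s")
```

With a uniform pick over the full window, a reply at 9.998 seconds is rare but entirely within the rules, which is how you end up racing the router's timeout.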
So the solution is simple right? We just hack our code to respond, y'know a bit early. So we did; you tell us 10 seconds to comply we'll comply in 5. Give us a second and we'll take half. Fixed, right?
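In code terms the hack is nothing more than shrinking the window before rolling the dice. A minimal sketch, assuming a uniform random delay; `early_report_delay` and its `fraction` parameter are names invented here, not anything from the real settop code.

```python
import random
from typing import Optional


def early_report_delay(max_resp_seconds: float,
                       fraction: float = 0.5,
                       rng: Optional[random.Random] = None) -> float:
    """Random reply delay, capped at a fraction of the advertised maximum.

    Hypothetical helper: with fraction=0.5, a 10 second window becomes
    5 seconds and a 1 second window becomes half a second.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = rng or random.Random()
    return rng.uniform(0.0, max_resp_seconds * fraction)


if __name__ == "__main__":
    print(f"10 s window: reply after {early_report_delay(10.0):.3f} s")
    print(f" 1 s window: reply after {early_report_delay(1.0):.3f} s")
```

Keeping the delay random, rather than always replying instantly, still spreads replies out over time; it just stops anyone from cutting it as fine as 9.998 seconds.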
Wrong. Still dropping.
So tonight it was the turn of the router team to get to watch the sun come up on the way home. _This is a lie, the sunrise is so damn' late they got to travel home in the dark..._
Although luckily, not totally in the dark as to what was going on. Seems that between the layer two and layer three protocols these "are you watching?" messages are getting lost. And so although we say yes, the middleman kindly ignores this and tells the boss no. And away goes the stream. Only on the next request does the response make it all the way back and we get a good flow of the stream.
The router development team are now investigating how to fix this properly, having given us a workaround to at least allow folk to watch TV uninterrupted.
In other news
Ian arrived on Tuesday. And this place, like the national debt, has a size and scale impossible to understand until you see it. When a tour takes 15 minutes you know it's a big place. Walking home we passed the BBC operation with the doors open, and you could see they have a few desks, a computer with, I swear, Windows for Workgroups 3.1, a Flip video camera - that's who bought it - and a dog. This, it seems, is all you need. According to the BBC, amateurs.
We had another fire alarm, it was at 4:00pm. OK. 4:00pm. There are signs all over the area - 4:00pm. So why we're standing outside at 12:30 no-one knows. And once again those that were eating, all of a sudden weren't. And I got to break the security seal on the fire doors. Which was strangely satisfying.
Apneic Jay continues to provide us with insights into the American mind. It's a wilderness in there, it is.
Anyway normal* service tomorrow, probably.
* For a given value of normal...