What would later become a great mess started with a single bug report some time back:
“[SlowAudio] HTML5 Live stream has bad sound and video”
(dramatic music effect)
found by Gwendragon, who in the following events tested, logged and categorized a lot of audio stuff too.
This bug happened only on a single site, which is well frequented in one country, but otherwise rather irrelevant.
Back then we could only check whether it worked for us – which it did for those who tested it after spoofing the user agent. Because at that time it appeared to be a problem for the site to fix, we contacted the site owner and set the bug to “cannot reproduce” (which means: Closed).
Usually this is the end of the story: the developers can only fix what they see and what they can analyze, and if it works for the testers and the developers, it can’t be broken, right?
Somehow we managed to forget one operating system (OS) during the check, which was partly my fault, because usually I use that OS while testing.
All was good until the next report came in for exactly the same site. We checked it again (usual procedure, no bug gets closed unchecked), this time with the right OS too and the result was: [insert some heavy cussing and swearing here – be creative, ’cause I was]!
Not only that – suddenly reports about dozens of other sites started popping up!
Well, no panic, but no real solution in sight either, which was quite frustrating, because at that time all of the developers who could fix this kind of bug were heavily burdened with really important stuff like frozen UI or crashes. Those of course have a higher priority, especially because super traffic-heavy sites like YT, Vimeo etc. worked well – but of course those audio bugs were constantly nagging in the background.
Then, one day, it was finally possible to assign a developer to the problem, but the fix was anything but easy.
One of the first things was to find out where it breaks and what exactly breaks – which sounds easy enough but isn’t, especially if you have to deal with the codecs inside the OS. Prohibitive license fees from the MPEG-LA and the Fraunhofer institute forbid bundling the full ffmpeg+aac codec in binary form with Vivaldi, no thanks to them.
(Linux users might have noticed by now that they are not hit by this because they can steal, er, re-purpose the original Chrome codecs hosted in the usual repositories – a luxury the owners of other OSs don’t have)
The first thing to do was to add some sophisticated debugging code to the internal builds which can provide log files for the sites that were affected by the issue.
Said and done, we testers (both the employed and we volunteers) went through all of the bugs and created tons of big log files which we dumped on the poor developer. After some time some of us became experienced enough to identify some common problems so that we could “Duplicate” bugs (meaning link together bugs that have the identical cause) to a kind of master bug (usually the bug where the exact problem was first seen).
One thing all of those bugs had in common was the interaction between Vivaldi and the OS codecs. First Vivaldi needed to parse out the correct information from the media stream to get the correct settings for the audio decoder, then it needed to deliver it to the OS in exactly the right way or everything would break, which it still did at that time. This meant digging deep into the OS – one of the developers even went so far as to read through The Windows Bible (a set of 1k+ page books that describe the internal workings in detail) to find “the right MS way(TM)” to do that.
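To give an idea of what “parsing out the correct information” can look like, here is a minimal sketch in Python. It assumes the audio arrives as raw ADTS frames, which is only one of several ways AAC gets packaged, and it is in no way Vivaldi’s actual code: the function name `parse_adts_header` and the dictionary keys are made up for illustration.

```python
# Sampling rates addressed by the 4-bit sampling frequency index in MPEG-4 Audio.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]


def parse_adts_header(data):
    """Extract the decoder-relevant fields from the first 7 bytes of an ADTS frame."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    # The 12-bit syncword 0xFFF marks the start of every ADTS frame.
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("no ADTS syncword found")
    # The two layer bits are always 0 for AAC.
    if data[1] & 0x06:
        raise ValueError("layer bits are not 0, probably not AAC")

    profile = (data[2] >> 6) & 0x03          # stored as audio object type minus 1
    rate_index = (data[2] >> 2) & 0x0F
    channels = ((data[2] & 0x01) << 2) | ((data[3] >> 6) & 0x03)
    # 13-bit frame length, header included, spread over three bytes.
    frame_length = ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5)

    return {
        "object_type": profile + 1,           # 2 means AAC-LC
        "sample_rate": SAMPLE_RATES[rate_index] if rate_index < len(SAMPLE_RATES) else None,
        "channels": channels,
        "has_crc": (data[1] & 0x01) == 0,     # protection_absent == 0 means a CRC follows
        "frame_length": frame_length,
    }


if __name__ == "__main__":
    # A hand-built header: AAC-LC, 44100 Hz, stereo, no CRC.
    print(parse_adts_header(bytes([0xFF, 0xF1, 0x50, 0x80, 0x2E, 0x7F, 0xFC])))
```

If any of these values is read wrongly, or handed to the OS decoder in a form it does not expect, the decoder gets configured for a stream that isn’t there, which is roughly the class of problem described above.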
This resulted in the first set of patches, or, to be more precise: One single patch which fixed a lot of the problems that existed.
(Here comes my bit of hero worship: Patricia did this patch “blind” – i.e. without working on the OS directly! And it worked on the first try! That is quite a feat!)
End of story … not …
… because this fixed only the most visible bugs, or audible, in this case.
There was a different bunch of similar bugs – which sounded the same but internally weren’t. We saw that they were there, but at that point we could not see what exactly caused them, so it was back to writing log files, this time with the improved “media bug logger 3000(TM)” to get even more data, and again we dumped it all on the developer:
Set me on the watch-list for ALL of those bugs! (slightly misquoted, but that was the essence of what Patricia said)
Those bugs were related to the way the servers send the data. To get a grip on those, the log files alone were not enough, because they told us only what the browser thinks it has seen – which was not always the same as what is really there.
Side note: To understand that, one must know that the involved AAC files can be delivered in multiple ways: with different profiles, with the profile announcing what is inside (the easier case), some with implicit announcement (which means we need to look inside the data), some without any headers at all, and some of those even encrypted (more about those later).
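For the curious, the difference between “announced” and “implicit” can be sketched with the AudioSpecificConfig, the small configuration record defined by MPEG-4 Audio that MP4-style streams carry. The Python below is a hedged illustration, not what Vivaldi does internally: `BitReader` and `parse_audio_specific_config` are names invented for this example, and it ignores the backward-compatible extension that may follow at the end of the record.

```python
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]


class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, count):
        value = 0
        for _ in range(count):
            if self.pos >= len(self.data) * 8:
                raise ValueError("config ended unexpectedly")
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def _read_object_type(reader):
    object_type = reader.read(5)
    if object_type == 31:            # escape value: 6 more bits follow
        object_type = 32 + reader.read(6)
    return object_type


def _read_sample_rate(reader):
    index = reader.read(4)
    if index == 15:                  # escape value: rate is written out in 24 bits
        return reader.read(24)
    return SAMPLE_RATES[index] if index < len(SAMPLE_RATES) else None


def parse_audio_specific_config(data):
    reader = BitReader(data)
    object_type = _read_object_type(reader)
    core_rate = _read_sample_rate(reader)
    channels = reader.read(4)
    output_rate = core_rate
    sbr_announced = False
    # Object types 5 (SBR) and 29 (PS) announce the extension explicitly:
    # the output rate and the underlying core object type follow.
    if object_type in (5, 29):
        sbr_announced = True
        output_rate = _read_sample_rate(reader)
        object_type = _read_object_type(reader)
    return {
        "core_object_type": object_type,
        "core_sample_rate": core_rate,
        "output_sample_rate": output_rate,
        "channels": channels,
        "sbr_announced": sbr_announced,
    }


if __name__ == "__main__":
    print(parse_audio_specific_config(bytes([0x12, 0x10])))              # plain AAC-LC
    print(parse_audio_specific_config(bytes([0x2B, 0x92, 0x08, 0x00])))  # SBR announced
```

In the implicit case the record looks exactly like plain AAC-LC at the lower rate, so `sbr_announced` stays `False` even though the extension data is in the stream. The only way to find out is to look into the audio data itself.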
We had to find a way to get the raw “files” as delivered by the server. Sadly they are not delivered as files, but in little chunks that temporarily exist in the RAM. Downloading the requested data with the dev tools was no option (see footnote 1), so Tarquin came up with the genius idea to grab the stuff with an extension (no, even that will not give you a working file, don’t bother to ask for a downloader. I can hear your thoughts from here, the extension doesn’t work that way 😛 ) and finally we were able to “download” small bunches of the raw data. But when we tried to look into those, a lot of what was inside looked like a garbled mess. We are no audio decoding experts; usually the codecs “just do it” and we all hope they do it right, so we definitely needed to wise up, quick.
At some point, when I got totally frustrated about that pile of horse manure some servers delivered, I remembered that I have a friend who “is into that codec stuff”. Usually SagaraS works with video codecs and does all kinds of fancy things with them to improve the output quality but hey, it was worth a shot. So I dumped some of the grabbed files we couldn’t analyze at that point on him and asked him to find out what all of that stuff means. It took him some hours or so to get a grip on the audio stuff. When he tried to explain to me what it does and how we can analyze it – in purest AV Geek language (see footnote 2), it could have been Swahili and I couldn’t have told the difference – my mind boggled. In the end we set up a TeamViewer session the next day so that I could look over his shoulder while he showed me what to watch out for and how to do it. It wasn’t that easy (he is armed with a hex editor and knows how to use it), but he gave us the necessary clues. Additionally he pointed us to tools we can use to analyze that stuff, which was of great help too.
So back to logging and grabbing and analyzing it was, while the developers started fixing the next bunch of issues, this time in a team effort between Patricia and Julien because of the sheer amount, resulting in what you now know as More HTML5 audio fixes – Vivaldi Browser snapshot 1.15.1147.23.
End of story? Not quite …
Some web pages stubbornly refused to give us meaningful content to grab with the above-mentioned extension: All we got was seriously garbled junk.
Oh, btw: While doing all of that he wrote a gigantic test suite for all kinds of audio issues with several hundred tests. Yes, Vivaldi still fails some of them (but so do other browsers, interestingly not all on the same tests) but it is still a work in progress, so stay tuned and watch the snapshot blog and the team blog 🙂
1) You can find it in the network tab of the developer tools. No need to try, it won’t give you a working file for streams.
2) Not really geek language, he isn’t like that – but understanding the correct technical terms is quite hard if you don’t have much expertise in an area. Seeing him point at the code blocks while he was explaining the details was much easier (for me) than only getting a description of it.
Here the usual disclaimer:
This blog post reflects my personal opinion and views. For brevity (and dramatic reasons) I left out some of the boring parts, but everything else happened more or less as described here, as well as I could remember it. All errors and mistakes are mine and nothing in here is binding for Vivaldi ASA or any person I mentioned in any way. It is only my personal, private view of the events I took part in as a “Soprano”. I have asked for permission to publish this (’cause NDA, ya know) to give a small look behind the scenes because I thought it might be interesting for some of you, or maybe even entertaining, and I can only say “Thank You!” for granting the request.