Hunting Down Audio Bugs – a view from behind the scenes

What would later become a great mess started with a single bug report some time back:

“[SlowAudio] HTML5 Live stream has bad sound and video”

(dramatic music effect)
found by Gwendragon, who also tested, logged and categorized a lot of audio issues in the events that followed.

This bug happened only on a single site, which is well frequented in one country, but otherwise rather irrelevant.
Back then we could only check whether it worked for us – which it did for those who tested it after spoofing the user agent – so after contacting the site owner the bug was set to “cannot reproduce” (which means: closed), because at that time it appeared to be a problem for the site to fix.

Usually this is the end of the story, the developers can only fix what they see and what they can analyze, and if it works for the testers and the developers, it can’t be broken, right?

Sadly not.

Somehow we managed to forget one operating system (OS) during the check, which was partly my fault, because usually I use that OS while testing.
All was good until the next report came in for exactly the same site. We checked it again (usual procedure, no bug gets closed unchecked), this time with the right OS too and the result was: [insert some heavy cussing and swearing here – be creative, ’cause I was]!

Not only that – suddenly reports about dozens of other sites started popping up!

PANIC!!!

Well, no panic, but no real solution in sight either, which was quite frustrating: at that time all of the developers who could fix this kind of bug were heavily burdened with really important stuff like a frozen UI or crashes. Those of course have a higher priority, especially because super traffic-heavy sites like YT, Vimeo etc. worked well – but those audio bugs kept nagging in the background.

Then, one day, it was finally possible to assign a developer to the problem, but the fix was anything but easy.

One of the first things was to find out where it breaks and what exactly breaks – which sounds easy enough but isn’t, especially if you have to deal with the codecs inside of the OS, because prohibitive license fees from the MPEG-LA and the Fraunhofer institute make it impossible to bundle the full ffmpeg+aac codec in binary form with Vivaldi, no thanks to them.
(Linux users might have noticed by now that they are not hit by this, because they can steal, er, re-purpose the original Chrome codecs hosted in the usual repositories – a luxury the owners of other OSs don’t have)

The first thing to do was to add some sophisticated debugging code to the internal builds which can provide log files for the sites that were affected by the issue.
Said and done, we testers (both the employed and we volunteers) went through all of the bugs and created tons of big log files which we dumped on the poor developer. After some time some of us became experienced enough to identify common problems, so that we could “Duplicate” bugs (meaning: link together bugs that have the identical cause) to some kind of master bug (usually the bug where the exact problem was first seen).

One thing all of those bugs had in common was the interaction between Vivaldi and the OS codecs. First Vivaldi needed to parse out the correct information from the media stream to get the correct settings for the audio decoder, then it needed to deliver it to the OS in exactly the right way or everything would break, which it still did at that time. This meant digging deep into the OS – one of the developers even went so far as to read through The Windows Bible (a set of 1k+ page books that describe the internal workings in detail) to find “the right MS way(TM)” to do that.

This resulted in the first set of patches, or, to be more precise: One single patch which fixed a lot of the problems that existed.
(Here my part of hero worshiping: Patricia did this patch “blind” – i.e. without working on the OS directly! And it worked on the first try! That is quite a huge feat!)

End of story … not …

… because this fixed only the most visible bugs, or audible, in this case.

There was a different bunch of similar bugs – which sounded the same but internally weren’t. We saw that they were there, but at that point we could not see what exactly caused them, so it was back to writing log files, this time with the improved “media bug logger 3000(TM)” to get even more data. Again we dumped it on the developer: “Set me on the watch-list for ALL of those bugs!” (slightly misquoted, but that was the essence of what Patricia said)

Those bugs were related to the way the servers send the data. To get a grip on those, the log files alone were not enough, because they told us only what the browser thinks it has seen – which might not be, and was not always, the same as what is really there.

Side note: To understand that one must know that the involved AAC files can be delivered in multiple ways: With different profiles, with the profile announcing what is inside (the easier case), some with implicit announcement (which means we need to look inside of the data), some without any headers at all and some of those even encrypted (more about those later).
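To make the “look inside of the data” part a bit more concrete: one common framing for raw AAC is ADTS, where every frame starts with a small header that announces the profile, sample rate and channel layout. The following is only a minimal sketch of reading such a header. The function name parseAdtsHeader and the sample bytes are made up for illustration, and this is not the code Vivaldi uses.

```javascript
// Sample rates indexed by the 4-bit sampling_frequency_index of an ADTS header.
const ADTS_SAMPLE_RATES = [
  96000, 88200, 64000, 48000, 44100, 32000,
  24000, 22050, 16000, 12000, 11025, 8000, 7350
];

// Hypothetical helper: returns the announced settings, or null if the
// bytes do not start with something that looks like an ADTS header.
function parseAdtsHeader(bytes) {
  if (bytes.length < 7) {
    return null; // an ADTS header is at least 7 bytes long
  }
  // 12-bit syncword (all ones), followed by the 2-bit layer field (always 0).
  const hasSync = bytes[0] === 0xff && (bytes[1] & 0xf0) === 0xf0;
  const layerOk = (bytes[1] & 0x06) === 0;
  if (!hasSync || !layerOk) {
    return null; // no explicit announcement here: look deeper into the data
  }
  const objectType = ((bytes[2] & 0xc0) >> 6) + 1; // 2 = AAC LC
  const rateIndex = (bytes[2] & 0x3c) >> 2;
  const channels = ((bytes[2] & 0x01) << 2) | ((bytes[3] & 0xc0) >> 6);
  return {
    objectType: objectType,
    sampleRate: ADTS_SAMPLE_RATES[rateIndex] || null,
    channels: channels
  };
}

// Made-up sample: AAC LC, 44.1 kHz, stereo.
const sample = [0xff, 0xf1, 0x50, 0x80, 0x00, 0x1f, 0xfc];
console.log(parseAdtsHeader(sample));
```

A matching syncword is only a hint, because the same bit pattern can also appear by chance inside the payload. Streams without any header, or encrypted ones, simply return null here.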

We had to find a way to get the raw “files” as delivered by the server. Sadly they are not delivered as files, but in little chunks that exist only temporarily in RAM. Downloading the requested data with the dev tools was no option (see footnote 1), so Tarquin came up with the genius idea to grab the stuff with an extension (no, even that will not give you a working file, don’t bother to ask for a downloader. I can hear your thoughts from here, the extension doesn’t work that way 😛 ) and finally we were able to “download” small bunches of the raw data. But when we tried to look into those, a lot of what was inside looked like a garbled mess. We are no audio decoding experts; usually the codecs “just do it” and we all hope they do it right, so we definitely needed to wise up, quick.

At some point, when I got totally frustrated about that pile of horse manure some servers delivered, I remembered that I have a friend who “is into that codec stuff”. Usually SagaraS works with video codecs and does all kinds of fancy things with them to improve the output quality but hey, it was worth a shot. So I dumped some of the grabbed files we couldn’t analyze at that point on him and asked him to find out what all of that stuff means. It took him some hours or so to get a grip on the audio stuff. When he tried to explain to me what it does and how we can analyze it – in purest AV Geek language (see footnote 2), it could have been Swahili and I couldn’t have told the difference – my mind boggled. In the end we set up a TeamViewer session the next day so that I could look over his shoulder while he showed me what to watch out for and how to do it. It wasn’t that easy; he is armed with a hex editor and knows how to use it, but he gave us the necessary clues. Additionally he pointed us to tools we can use to analyze that stuff, which was of great help too.

So back to logging and grabbing and analyzing it was, while the developers started fixing the next bunch of issues, this time in a teamed effort between Patricia and Julien because of the sheer amount, resulting in what you now know as More HTML5 audio fixes – Vivaldi Browser snapshot 1.15.1147.23.

End of story? Not quite …

Some web pages stubbornly refused to give us meaningful content to grab with the above mentioned extension: All we got was seriously garbled junk.

In the end we had to dive deep into the code of the website player – cough, not really “we”, the developer of the extension. I could only confirm from the code (which was a PITA of its own: >60k lines of JavaScript is a lot to look through) that they encrypt the stuff on the server, decrypt it in the browser on the fly and dynamically add the header to it, but not what exactly it does. Finally Tarquin found the exact place to grab the information needed for the decryption module, so be prepared for the next bunch of fixes.

Oh, btw: While doing all of that he wrote a gigantic test suite for all kinds of audio issues with several hundred tests. Yes, Vivaldi still fails some of them (but so do other browsers too, interestingly not all on the same tests) but it is still a work in progress, so stay tuned and watch the snapshot blog and the team blog 🙂


1) You can find it in the network tab of the developer tools. No need to try, it won’t give you a working file for streams.
2) Not really geek language, he isn’t like that – but understanding the correct technical terms is quite hard if you don’t have much expertise in an area. Seeing him pointing at the code blocks while he was explaining the detail was much easier (for me) than only getting a description of it.


Here the usual disclaimer:
This blog post reflects my personal opinion and views. For brevity (and dramatic reasons) I left out some of the boring parts, but everything else happened more or less as described here, as well as I can remember it. All errors and mistakes are mine and nothing in here is binding on Vivaldi ASA or any person I mentioned in any way. It is only my personal, private view of the events I took part in as a “Soprano”. I have asked for permission to publish this (’cause NDA, ya know) to give a small look behind the scenes because I thought it might be interesting for some of you, or maybe even entertaining, and I can only say “Thank You!” for granting the request.

Rest in Peace, KartOO

Back then …

Whenever I knew what information I needed but didn’t know which search terms would lead me to this information, you were there and helped me reliably.

I only needed to enter a search term and you showed me a cloud of pictures with lines in between, showing the interlinks with added keywords connecting these pictures. You allowed me to hover over one of the pictures to get a short summary of the content of the page.

I could click on one of the terms on the connecting lines and you provided me with a new, refined cloud until there were only 10 or 20 pages left, all of which were filled with very good results.

If I clicked on one of the preview pictures, you opened the page in a new tab in the browser and flagged that I have visited the page in your search window.

Also, it was possible to set which additional information I wanted to have and you even showed me groups of possible search results…

Unfortunately, you are no longer.

Why?

I have no idea.

Maybe you were just too innovative for the “I don’t have time” searchers who just want quick information tidbits, 1,000,000 hits in 0.23s, who are not interested in the quality of the results.

But maybe you were just too unknown and therefore didn’t get any advertising revenue – although I did a lot of advertising for you…

… but what’s a single voice against the overwhelming advertising power of the data kraken?

The above eulogy was originally posted February 12, 2010 on my.opera, it is still available in the Web Archive.

The first part of this article is a translated and slightly rewritten version – I am still moaning about its too early demise. I wonder if we will ever see a public search engine like this again.

… and today

Advertising money is what makes the web spin. Cases like the above, where new ideas and technologies were starved to death, maybe without noticing or maybe even deliberately, are still a thing, maybe even more than ever, as we can see with e.g. www.foundem.co.uk which is a vertical search engine. The technology and mathematics behind vertical search engines like those is really amazing, and they can solve problems the usual horizontal search engine suspects like Google, bing etc.pp. can’t, but have you ever heard about it? No? I guessed so …

The story behind it?

Read it in this New York Times Magazine article: The case against Google.

Also available in the waybackmachine. Make sure to press the stop button as soon as you see the article content to avoid the paywall redirection on the archived page.

Almost the same happened some years earlier to Cuil – which ironically was founded by some former Google employees.

Side note: Ignore the “criticism” part in the Wikipedia article – Cuil was still new and all new search engines I have ever seen had the same problems and much worse in their early days – yes, Google back then in 1997 too. Been there, seen that.

Not only did they have their own crawler that maintained a gigantic database of web pages, which at that time was about the same size as the ones of Google, Microsoft Search (now bing) and Altavista (while it was still a thing) combined, but beside the usual search result page they had a really interesting and working approach to make results actually digestible:

It was called “cpedia”, but in contrast to the well known Wikipedia, it was based purely on their data mining agents, which collected information all over the web and – now the really interesting part – wrote executive summaries (backed up with links so that you could check the sources) of about 1000 words, which were astonishingly well “written”, often almost indistinguishable from the work of a human author. If possible they backed up the results with metadata, such as statistics or analytic tools, and linked to related videos or images too. This was a great way to get a cursory overview of almost everything you could imagine, and sometimes it even showed astonishing insights that were really hard to come by otherwise.

What happened to them?

Dead after a few rounds of fundraising and sold to Google (If you can’t squash them, buy them?), because the advertising market at that time was already taken over by Google, so this kind of business model was no longer viable.

The odyssey of finding a certain video

This is only a short rant about the stupidity of a bunch of search engines that basically all use the same search database.

I knew that I had once seen a video of Brendan Eich (more than a year ago) that dealt with classes in JavaScript, especially ES6 and I wanted to watch it again because I had some hunch that in that video he addressed a problem I have right now.

(Sometimes my brain works in weird ways – I usually don’t remember the things I have watched a long time ago, only that I have and roughly about where I have seen them)

I don’t keep a ginormous history in Vivaldi because I have noticed that too many history results might lead to a slow down of the browser under certain circumstances and especially because all kinds of C**P are stored in the history (e.g. search engine result pages from yesterday – which will definitely not give the same results today), so I had to find that video again. I still knew that I didn’t watch it on YouTube, but somewhere else, so I started a search at the usual non-Google-ish search engines first.

A standard search “classes in javascript ES6” gave almost exclusively YT results (yes, I forgot to add “Brendan Eich” – didn’t remember who gave the talk, could have been Douglas Crockford too, so what?) so I looked up the parameters of my favorite search engines and tried to exclude YT from the results pages by adding some.
(Yes I know that it was still a very unrefined search – but hey – I’ve got results 😉 )

All links to the result pages open in a new tab.

DuckDuckGo:

startpage:

Ecosia:

bing:

And finally I added Google as a “control group”:

I leave the results open to interpretation by you, maybe you get different results, or maybe you can tell me what was wrong with my searches – but I found some search engines lacking.

How To add Search Engines that use POST to Vivaldi

There is a long-standing problem with Vivaldi not being able to add every kind of search engine, e.g. the current implementation still does not add search engines like SearX.

This happens because those engines use POST instead of GET in their search forms and Vivaldi does not (yet) check for that.

To mitigate the problem you can use the following bookmarklet:
javascript:(function()%7Bvar%20x,i%3Bx%3Ddocument.forms%3Bfor(i%3D0%3Bi%3Cx.length%3B++i)x%5Bi%5D.method%3D%22get%22%3Balert(%22Changed%20%22%20+%20x.length%20+%20%22%20forms%20to%20use%20the%20GET%20method.%5Cn%5CnDON'T%20USE%20THE%20ALTERED%20PAGE%20NOW!%5Cn%5Cn%20-%20Right-click%20in%20the%20search%20field%5Cn%20-%20select%20%5C%22Add%20as%20search%20engine%5C%22%5Cn%20-%20Open%20%5C%22Settings%5C%22%20%3E%20%5C%22Search%5C%22%5Cn%20-%20Click%20the%20edit%20button%20for%20the%20new%20search%5Cn%20-%20Check%20%5C%22Use%20Post%5C%22%5Cn%5CnEnjoy!%22)%3B%7D)()%3B

  • Create a new bookmark in the panel and copy/paste the line above in the URL field of the Bookmarks panel.
  • Give it a name like e.g. “GET the POST search”.
  • Optionally add a shortcut like “getit” too.

Now you can go to one of the search engine pages linked above and either click on the bookmarklet or type the shortcut in the URL bar and hit enter.

An alert-box with additional instructions will pop up. Follow the instructions.

Warning: The site will be altered by the bookmarklet. Don’t use it for a search before you have reloaded the page or else it might break!

How it works

The bookmarklet changes all FORM elements on the page to use GET instead of POST. This allows right click > “Add as Search Engine” to detect it and enter the corresponding values it needs. Don’t forget to set the “Use POST Method” checkbox for the newly created search in “Vivaldi > Settings > Search”.

Here the decoded and unminified “source code”:

javascript: (function () {
    var x, i;
    x = document.forms;
    for (i = 0; i < x.length; ++i) {
        x[i].method = "get"; 
    }
    alert("Changed " + x.length + " forms to use the GET method.\n\nDON'T USE THE ALTERED PAGE NOW!\n\n - Right-click in the search field\n - select \"Add as search engine\"\n - Open \"Settings\" > \"Search\"\n - Click the edit button for the new search\n - Check \"Use Post Method\"\n\nEnjoy!");
})();
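
If you want to build such a one-line bookmarklet from readable source yourself, the following minimal sketch shows one way to do it. The helper name toBookmarklet is made up for this example, and it is not necessarily how the line above was produced. It percent-encodes a few more characters (e.g. the plus sign) than the hand-made line does, which is harmless.

```javascript
// Hypothetical helper: wraps the source in an IIFE so that it does not leak
// variables into the page, then percent-encodes it for use as a bookmark URL.
function toBookmarklet(source) {
  const wrapped = '(function(){' + source + '})();';
  return 'javascript:' + encodeURIComponent(wrapped);
}

// The form-switching part of the bookmarklet, without the long alert text.
const source =
  'var x=document.forms,i;' +
  'for(i=0;i<x.length;++i)x[i].method="get";';

console.log(toBookmarklet(source));
```

Paste the printed line into the URL field of a new bookmark, exactly as described for the ready-made bookmarklet.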

In the Name of Accessibility: Check Your Alt Attributes!

Occasionally I browse the net with images switched off because I am only interested in the text and don’t want to download gigantic amounts of decorative images – especially since several pages started to use HiDPI images which are 4 times the size needed and thus HUGE downloads – and I noticed that some authors are seemingly not aware that they are missing an important accessibility feature that has existed almost since the web was invented:

The alt attribute on images.

While people who don’t switch off image loading, or who can see well and don’t need assistive technologies like e.g. screen readers to get hold of the content, will not notice anything, users who don’t or can’t display images for various reasons are at a loss in such a situation. So please, dear web developers, check if all of your non-decorative images have those alt attributes.

An easy way is to temporarily overwrite the page styles with the following stylesheet:

html::before {
    content: "All visible images are either missing alt attributes or the alt attributes don't follow the specification!" !important;
    font-family: sans-serif !important;
    font-size: 20px !important;
    font-weight: bold !important;
    text-align: center !important;
    color: #F8F8F8 !important;
    background-color: #EE3333 !important;
    padding: 5px 0 !important;
    margin: 0 !important;
    border: none !important;
    width: 100% !important;
    float: none !important;
    position: static !important;
    display: inline-block !important;
}
html {
    color: #202020 !important;
    background-color: #F8F8F8 !important;
    padding: 0 !important;
    margin: 0 !important;
    border: none !important;
    width: auto !important;
    max-width: none !important;
    display: block !important;
}
html * {
    visibility: hidden !important;
}
html area,
html img {
    visibility: visible !important;
}
html area[alt],
html img[alt] {
    visibility: hidden !important;
}
html area[alt="*"],
html area[alt=""],
html img[alt="*"],
html img[alt=""],
html img[alt^=" "] {
    outline: 4px solid rgb(51, 102, 204) !important;
    visibility: visible !important;
}

You might need to adapt it if you use a Styling Extension, this is written for direct use in Vivaldi.
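
If you prefer numbers over a visual check, the same rules can be sketched in a few lines of JavaScript for the developer tools console. This is only an illustration: the function names classifyAlt and summarizeAlts are made up, and as a simplification the sketch treats area and img elements alike.

```javascript
// Mirrors the selectors of the stylesheet above:
//  - no alt attribute at all                -> "missing"
//  - alt="" or alt="*" or a leading space   -> "flagged"
//  - everything else                        -> "ok"
function classifyAlt(alt) {
  if (alt === null || alt === undefined) {
    return 'missing';
  }
  if (alt === '' || alt === '*' || alt.startsWith(' ')) {
    return 'flagged';
  }
  return 'ok';
}

// Counts how many alt values fall into each group.
function summarizeAlts(alts) {
  const summary = { missing: 0, flagged: 0, ok: 0 };
  for (const alt of alts) {
    summary[classifyAlt(alt)] += 1;
  }
  return summary;
}

// Usage in a browser console (getAttribute returns null if alt is absent):
// summarizeAlts(Array.from(document.querySelectorAll('img, area'),
//   function (el) { return el.getAttribute('alt'); }));
console.log(summarizeAlts([null, '', '*', ' photo', 'A red bicycle']));
```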

In Vivaldi you can add this as page action. Just save it as Image_Alt_Debugger.css to [path to]\Application\#.#.###.##\resources\vivaldi\user_files and after the next browser restart you will see the Entry “Image Alt Debugger” in the list of page actions. (Hint: Underscores in the name get replaced by Spaces for display)

HTH 🙂

PS: Sadly you have to do that for each and every update of Vivaldi because for some unknown reason they put this stuff into the Application directory instead of the User Data directory.

Why Vivaldi’s Reader (and others) sometimes don’t do what you expect

TL;DR

Vivaldi Read Mode was never meant for pages like Twitter, Youtube comments, or Facebook chats but for articles.

As a general rule of thumb you can assume that everything that has a continuous or adjacent text chunk of more than 300 characters counts as content. It may be split into multiple adjacent paragraphs, but it must look content-y enough (i.e. consist of sentences), may not contain too many links, videos or images inside of the content area (outside is fine) and must not belong to one of the “stop” classes and IDs like e.g. “comment”, “footer”, “ad”-vertisement and many others.

/TL;DR

Still there?

History

The Vivaldi Read View is – like the read view in Mozilla Firefox and Apple Safari – based on the Readability(TM) code that was released as a labs experiment by arc90 in about 2009 under an open source license (some versions under MIT, some under Apache license). Later arc90 changed it to a server supported version that is available at readability.com.

The Intention Behind It

Readability was never meant to be an ad-blocker, but always an on-demand reader view to switch on for *articles*, meaning: Longer passages of text (important!)

It was never intended to be used on pages like Facebook, with its gazillions of short text snippets, Youtube video comments, Twitter feeds and generally not on any page that does not contain a sizable longer chunk of text in one article.

It was meant to make reading of longer texts distraction free by removing e.g. advertisements, page navigation, comments and videos or images that don’t belong to the main article content, and to re-style it with readable fonts and colors to make reading more pleasurable. 

How?!

Of course the code is not really “intelligent” (it has to be fast and may not use up too many resources), so it has to rely on some kind of heuristics to detect where the main content might be. While it generally works quite well, it may fail on some pages, especially “if the HTML is totally hosed” (not my words, that was a comment of one of the original arc90 developers).

A (simplified and not complete) Explanation:

First steps:

  • Remove all scripts.
  • Remove all styles.
  • Ignore HTML block level elements like paragraphs and divisions with less than 25 characters completely.
  • Remove HTML block level elements that have “stop” classes or IDs or tags that indicate that they are definitely not content but something else like e.g. navigation, footers, comments or advertisements etc.pp.

After that the reader loops through all paragraphs, and

  • calculates the over-all score for text length by the following formula: 
    rounded-down((pure-Text character count of a page element)/100)
    and adds it to the parent element (you might see it as a container). This means: A paragraph with less than 100 characters of text does not get any bonus at all.
  • adds a base score of 1 for each remaining paragraph to the parent element 
  • assigns a score to them based on how content-y they look. This score gets added to their parent node.
  • adds additional scores that are determined by things like number of commas (+), class names and IDs (+/-), image and link density (More than 25% of the text is links? Too many images per paragraph? Punish it!) etc.
  • punishes List, Headline, Forms and Address and some other Elements with negative scores because they are normally not part of articles, and if they are, they are usually in the same parent container as the paragraphs in a real article, so the combined score of the parent element is still high enough to count.
  • adds half of the resulting score to the grandparent elements.

When that all is done and the parent or grandparent has a high enough score, it is seen as content and gets displayed, everything else gets removed.
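
To get a feeling for the numbers, here is a heavily simplified sketch of the scoring described above. It is not the actual Readability or Vivaldi code: the function names are made up, the weight of one point per comma is my assumption, and class names, link density and the punishments are left out completely.

```javascript
// Score for a single paragraph: base score + commas + length bonus.
function scoreParagraph(text) {
  const clean = text.trim();
  if (clean.length < 25) {
    return 0; // too short, ignored completely
  }
  const base = 1;
  const commas = (clean.match(/,/g) || []).length; // assumption: +1 per comma
  const lengthBonus = Math.floor(clean.length / 100); // rounded down
  return base + commas + lengthBonus;
}

// paragraphs: [{ text, parent, grandparent }], where parent and grandparent
// are just names of the containing elements in this sketch.
function scoreContainers(paragraphs) {
  const scores = new Map();
  for (const p of paragraphs) {
    const score = scoreParagraph(p.text);
    if (score === 0) {
      continue;
    }
    // The full score goes to the parent ...
    scores.set(p.parent, (scores.get(p.parent) || 0) + score);
    // ... and half of it to the grandparent.
    if (p.grandparent) {
      scores.set(p.grandparent, (scores.get(p.grandparent) || 0) + score / 2);
    }
  }
  return scores;
}

const longText = 'a'.repeat(120) + ', ' + 'b'.repeat(120) + ', end';
const result = scoreContainers([
  { text: longText, parent: 'article', grandparent: 'main' },
  { text: 'Share this!', parent: 'sidebar', grandparent: 'main' }
]);
console.log(result);
```

In this made-up example the long paragraph lifts its container, while the short snippet in the sidebar is never scored at all. That is exactly why pages consisting only of short snippets end up with no detectable content.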

You can probably imagine now how many pitfalls there are for content detection to fall into, so please take a break if you see it fail and think about what might have caused it this time.

Personal side note (strong language warning)

All in all content detection is a bitch and can definitely fail on some pages, especially if the “Webmasters” (I call them Crapmasters) don’t know what an HTML validator is and have never heard about structured pages and accessibility. I am speaking from experience: Back in 2009 I started with a userscript and later made an extension (cleanPages, see the old my.opera page on archive.org) based on the full original arc90 code and fine-tuned it for Opera Presto (and ported it later for the new Opera thing). It had over 250k installs and while it was fun to tweak for better results, it was a hell of a lot of work. I wrote more than 200 versions with generic fixes for “just another couple of new pages that fail” but in the end I gave in and called it a day – there are too many broken pages out there where the webmasters seemingly do not want people to read the content. Their wish is my command 😉

So please be gentle with the Vivaldi developers – yes, there is still some fine-tuning to be done, but that is really time consuming. It will probably have to wait because there are some other, more difficult and bigger things in the pipe (hint, hint 😀 )

Thank You!

Disclaimer: While I am a “Soprano” (aka external tester for internal builds), all the views in this text are my private views and do not necessarily reflect the views or opinions of Vivaldi (the company) or any of its owners or employees.

Followup on styling Vivaldi with CSS

Outdated. No longer needed! We have Themes now 🙂

Following up the post https://quhno.vivaldi.net/2015/07/02/some-quick-vivaldi-panels-css-hacks-for-better-readability-or-accessibility/

I’ve made some smallish changes that take care of some minor things I don’t like with the UI. They can be applied the same way as in the previous blog post.

This time I moved the notes editor scrollbar to the right, made the speed dial navigation a little less high so that it aligns with the header of the web panels, moved the speed dial items a bit upwards and colored the panel scrollbars for the dark UI.

Extensions are great – Extensions inhibit progress

Sounds provocative?
Fine. It was planned to sound like that.

While we all agree that it is impossible to cover every need of a user with one piece of software alone and that we therefore need a way to extend it, sometimes extensions can be a stumbling block for other development.

Broken CSS? Lint it!

While looking for something completely different I stumbled upon CSS LINT.

Quote:

Will hurt your feelings*

(And help you code better)

http://csslint.net/

Not the usual beautifier, but it does what it says: It tells you what you can and should improve in a similarly rigorous way as JSLint does for JavaScript. Be prepared for a long list of Errors or Warnings 😉

 

While it did not hurt my feelings (Sticks and stones …) it gave me several valuable hints how to improve my CSS-Fu …

Restoring the Old Opera 12 Space Bar Behavior in Vivaldi

In the good old days of “The Real Opera(TM)” there was one feature for lazy people like me that I heavily used: The Space Bar!

I can hear you saying “The space bar is no feature, it is a key on the keyboard!” but that is only half of the truth:

If no edit field was focused but the page itself and you pressed the space bar, “The Real Opera(TM)” scrolled one page down. Nothing special so far, almost all browsers today do that (guess from which browser they copied that function ;)) but the other browsers missed one thing:

If you were on a paginated page or if the page had a LINK or an A rel="next" element, or if simply the link text contained “Next” or something similar in one of many languages, “The Real Opera(TM)” took you to the next page after you had scrolled to the bottom and hit the space bar again. I am one of those persons who actually read pages until the bottom and often follow the link to the next part of the article, so trying to restore this behavior was a must.

OK. Stop babbling and show us the code instead!

// ==UserScript== 
// @name Restore O12 space bar behavior
// @version 0.19 
// @author Roland "QuHno" Reck 
// @include http://*.* 
// @include https://*.* 
// ==/UserScript== 
(function (window) {
  "use strict";
  /*jshint browser: true, devel: false, debug: true, evil: true, forin: true, undef: true, bitwise: true, eqnull: true, noarg: true, noempty: true, eqeqeq: true, boss: true, loopfunc: true, laxbreak: true, strict: true, curly: true, nonew: true */
  function is_number(obj) {
    return !isNaN(obj - 0);
  }
  function make_ipattern(string) {
    return '/' + string + '/i';
  }
  function split_hash(obj) {
    //return (url + '#a').split('#')[0];
    return (obj.pathname + obj.search);
  }
  function getScrollMaxY() {
    return document.documentElement.scrollHeight - document.documentElement.clientHeight;
  }
  function scrollToPos(el, x, y) {
    el.scrollLeft = x;
    el.scrollTop = y;
  }
  var i,
  j,
  HREF = split_hash(window.location),
  PROTOCOL = window.location.protocol,
  HOST = window.location.host,
  HOSTPATTERN = make_ipattern(HOST),
  A,
  LRN,
  theParent,
  searchDepth,
  stop = false,
  linkContainer = false,
  regexps = {
    trim: /^\s+|\s+$/g,
    normalize: /\s{2,}/g,
    relLink: /^next$/gi,
    nextLink: /(Next\s*(page)?|Neste\s*(side)?|N(ä|ae)chste\s*(Seite)?|Weiter(e.*)?|Vorw(ä|ae)rts|Volg(ende)?\s*(bladsy|pagina)?|Verder|(Page)?\s*Suiv(ant)?(e)?(s)?|(Page)?\s*(prochaine)|Avanti|(Pag(ina)?)?\s*Succ(essiv(e|a|o)|Prossim(e|a|o))|Altr(a|o)|(P(á|a)gina)?\s*(S(e|i)guie?nte)|Próxim(a|os?)|Nästa\s*(sida)?|Næste?\s*(side)?|下一頁|下一页|Sonraki|Следующая|Далее| 下一页|下一张|下一篇|下一章|下一节|下一步|下一个|下页|后页|下一頁|下一張|下一節|下一個|下頁|後頁|다음|다음\s*페이지|次へ|Seuraava|Επόμενη|Следващ(а|о|и)\s*(страница|сайт)?|Нататък|След(в|н)а|(Пълен)?\s*Напред|Dalej|Następn(a|e|y)|Więcej|Tovább|Köv(etkező|\.)|Bővebben|Înaint(ar)?e(ază)?|Avansează|(Pagina\s*)?Următoa?r(e|ul)?|>([^\|]|$)|»([^\|]|$)|→)/i,
    extraneous: /print|archive|comment|discuss|e[\-]?mail|share|reply|all|login|sign|single|teaser/i,
    back: /(back|prev|earl|old|new|zurück|vorige|rückwärts|назад|<|«)/i,
    images: /\.jpe?g$|\.png$|\.webp$|\.gif$/i,
    cut: /[?#]/
  },
  actHREF,
  prevHREF,
  imageResult,
  shouldImagePreview = false,
  shouldFF = false,
  UUID = '43D82723-A99E-4BFD-ACDC-B7D8270EE75C';


  /* *********************************************************************************************
  FIND LINK TO NEXT PAGE
   */
  function analyze() {

    HREF = split_hash(window.location);
    PROTOCOL = window.location.protocol;
    HOST = window.location.host;
    HOSTPATTERN = make_ipattern(HOST);
    LRN = false;
    linkContainer = false;

    //* STEP 1: Look if the webmaster knew rel=  *//
    // If there is a <link rel="next" ... and it points to the same origin: Use it.
    LRN = document.querySelector('link[rel="next"]');
    if (!!LRN && LRN.href.indexOf(window.location.origin) < 0) {
      LRN = false;
    }
    // If there is a <A rel="next" ... and it points to the same origin: Use it.
    if (!LRN) {
      linkContainer = document.body.querySelector('A[rel="next"]');
      if (!!linkContainer && linkContainer.href.indexOf(window.location.origin) < 0) {
        linkContainer = false;
      }
    }

    if (!linkContainer) {

      A = document.body.querySelectorAll('A');
      for (i = 1, j = A.length - 2; i < j; i++) {
        // Skip this link if it does not have the same origin
        if (A[i].href.indexOf(window.location.origin) < 0) {
          continue;
        }

        //* STEP 2: Try to find out if the classes and IDs of the parent elements give a hint on the next link. *//
        actHREF = split_hash(A[i]);
        prevHREF = split_hash(A[i - 1]);
        theParent = A[i];

        // We limit the search depth to 6 to avoid wading through too many elements
        searchDepth = 6;

        stop = false;
        while (searchDepth > 0 && stop === false && theParent.parentElement && theParent.parentElement !== document.body) {
          /*
          if (!theParent.parentElement) {
          // searchDepth = 0;
          break;
          }
           */

          theParent.classAndId = theParent.className + ' z ' + theParent.id;
          searchDepth = searchDepth - 1;

          // This is quite probably the wrong link.
          // Maybe this should be done outside of the loop because some navigations contain classes like "prev next" in the parent elements
          if (theParent.classAndId.match(regexps.back)) {
            stop = true;
            break;
          }

          // If we don't get "hard evidence" we don't want links in print or discussion things
          if (theParent.classAndId.match(regexps.extraneous)) {
            // stop = true;
            break;
          }

          // this could be the next link, if there is no better match later
          if (theParent.classAndId.match(regexps.nextLink)) {
            linkContainer = A[i];
          }
          theParent = theParent.parentElement;
        }
        // A "back" hint was found in the parent chain, so skip this link.
        // Note: this shortcut seems to hurt more than the time it saves is worth.
        if (stop) {
          continue;
        }

        //* STEP 3: Look for numbered pagination like e.g. in forums *//
        if (is_number(A[i].textContent) && A[i].textContent > 0) {
          if (is_number(A[i - 1].textContent) === false || (A[i - 1].textContent - 0 === 0)) {
            if (is_number(A[i + 1].textContent) && A[i + 1].textContent > 0 && actHREF !== HREF && A[i].textContent !== '1') {
              linkContainer = A[i];
              continue;
            }
          }
          if (is_number(A[i - 1].textContent) && A[i - 1].textContent > 0) {

            // if the previous link is to the page where we are right now it is quite possible that this is the link to the next page
            if (prevHREF === HREF) {
              linkContainer = A[i];
              continue;

              // this is for skipped pagination where the actual page is not linked, so the link to the previous page is 2 lower than that to the next page.
            } else if ((A[i].textContent - A[i - 1].textContent - 0) === 2) {
              linkContainer = A[i];
              continue;
            }
          }
          // I don't know anymore why this was important, but it breaks somewhere if this is not in.
          if (prevHREF === HREF) {
            linkContainer = A[i];
            continue;
          }
        }

        //* STEP 4: Match IDs. Almost as good as rel, therefore: If there is a good ID, we take it and stop searching *//
        if (A[i].id) {
          if (A[i].id.match(regexps.nextLink)) {
            linkContainer = A[i];
            break;
          }
        }

        //* STEP 5: CLASSes are weaker than ID, so store but don't stop. If there is no better Link we take it. *//
        if (A[i].className) {
          if (A[i].className.match(regexps.nextLink)) {
            linkContainer = A[i];
            continue;
          }
        }

        //* STEP 6: We are down to textContent - quite weak, but better than nothing *//
        if (A[i].textContent) {
          if (A[i].textContent.replace(/\s+/g, ' ').length < 25 && A[i].textContent.match(regexps.nextLink)) {
            linkContainer = A[i];
            continue;
          }
        }

        //* STEP 7: TITLE, even weaker ... *//
        if (A[i].getAttribute("title")) {
          if (A[i].textContent.replace(/\s+/g, ' ').length < 25 && A[i].getAttribute("title").match(regexps.nextLink)) {
            linkContainer = A[i];
            continue;
          }
        }

      }

    }

    // Comment this out if you don't want a hint about the next link
    // set borders _and_ outlines because in most cases websites don't alter both at the same time
    if (!!linkContainer) {
      linkContainer.style.border = '1px solid #FF0000';
      linkContainer.style.outline = '1px solid #FFff00';
      if (!document.querySelector('a[rel="next"]')) {
        linkContainer.setAttribute('rel', 'next');
        // Experimental: Maybe the Fast Forward button reacts on that at some time in the future ... probably not.
        var evt = new Event('load');
        window.dispatchEvent(evt);
      }
    }
  }
  function doFastForward() {
    /* Hard-coded exceptions for search engine pages with messed-up markup */
    var temp = null;
    /* Special treatment for startpage.com */
    if (HOST.indexOf('startpage.com') > -1) {
      temp = document.querySelector('#nextnavbar > form > a > span.i_next');
      // Guard against markup changes: without a match there is nothing to click
      if (temp) {
        temp.click();
      }
      return false;
    }

    /* Normal stuff */
    /* we've got a hard link rel */
    if (LRN && LRN.href) {
      window.location.href = LRN.href;
      return false;
    }
    if (!!linkContainer) {
      linkContainer.click();
    }
  }

  var handleKeyDown = function (e) {
    if (e.key === "PageDown" && shouldFF) {
      doFastForward();
    }
    // Don't space2next if the focus is in an editable area
    // Did I forget something?
    if (/INPUT|SELECT|TEXTAREA|CANVAS/.test(e.target.tagName) || e.target.isContentEditable || document.designMode === 'on') {
      return;
    }

    // ToDo: If there are linked imageExtensions, add the logic to show them

    // If the page is already scrolled to the bottom and the user hits space, go to the next page if a link was found.
    if (e.key === ' ' && shouldFF) {
      doFastForward();
    }
  };

  window.addEventListener('scroll', function (e) {

    // don't analyze if the user did not scroll to the bottom
    if (window.scrollY >= getScrollMaxY() && window.scrollY > 0) {
      analyze();
      shouldFF = true;
    } else if (window.scrollY < getScrollMaxY()) {
      shouldFF = false;
    }
    }
  }, false);
  window.addEventListener('keydown', handleKeyDown, false);
})(window);
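
A side note on the same-origin test used in STEP 1 and STEP 2, before we get to the installation: checking href.indexOf(window.location.origin) < 0 only asks whether the origin shows up somewhere in the link, so a foreign URL that carries the origin in its query string slips through. Below is a minimal sketch of a stricter comparison using the standard URL constructor. The name isSameOrigin is a hypothetical helper for illustration, it is not part of the script above.

```javascript
// Hypothetical helper, NOT part of the user script above.
// Returns true only if `href` resolves to a URL whose origin equals `origin`.
function isSameOrigin(href, origin) {
  try {
    // The second argument is the base, so relative hrefs resolve against `origin`.
    return new URL(href, origin).origin === origin;
  } catch (e) {
    // Anything that cannot be parsed as a URL is treated as foreign.
    return false;
  }
}

const origin = 'https://example.com';
console.log(isSameOrigin('https://example.com/page/2', origin));
console.log(isSameOrigin('https://evil.example/?u=https://example.com', origin));
console.log(isSameOrigin('javascript:void(0)', origin));
```

If something like this were used, the indexOf checks could turn into calls such as isSameOrigin(A[i].href, window.location.origin). I have not tried that on real pages, so treat it as an idea, not as a tested fix.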

 

  • Copy the code into a decent text editor of your choice (please make sure the editor is set to UTF-8, or you will run into problems)
  • Save it as restoreSpaceBar.user.js (yes, with two dots) somewhere you can find it again
  • Open vivaldi://extensions
  • Drag and drop the file onto that page (you might need to enable developer mode on the extensions management page first)
  • Confirm that you want to install it. That’s it.

 

Warning: It is still a work in progress and may break things.
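
If you want to tinker with the numbered pagination logic (STEP 3) without reloading forum pages all day, here is a rough standalone sketch of the same idea. It works on plain objects instead of DOM links. The names findNextByNumbers and isPageNumber as well as the { text, href } shape are assumptions made for this example, and isPageNumber is only a stand-in for the script's is_number(), which is not shown in this part.

```javascript
// Hypothetical stand-in for the script's is_number(): a positive integer as text.
function isPageNumber(text) {
  return /^\s*\d+\s*$/.test(text) && Number(text) > 0;
}

// links: array of { text, href } in document order.
// currentHref: URL of the page we are on.
// Returns the link guessed to be "next page", or null if nothing fits.
function findNextByNumbers(links, currentHref) {
  let candidate = null;
  // Skip the first and the last link so both neighbours always exist.
  for (let i = 1; i < links.length - 1; i++) {
    const prev = links[i - 1], cur = links[i], next = links[i + 1];
    if (!isPageNumber(cur.text)) {
      continue;
    }
    const n = Number(cur.text);
    if (!isPageNumber(prev.text) && isPageNumber(next.text) &&
        cur.href !== currentHref && n !== 1) {
      // Start of a number run that does not begin with 1: we are probably on page 1.
      candidate = cur;
    } else if (prev.href === currentHref) {
      // The previous link points to the current page.
      candidate = cur;
    } else if (isPageNumber(prev.text) && n - Number(prev.text) === 2) {
      // The current page is not linked, so the numbers jump by 2.
      candidate = cur;
    }
  }
  return candidate;
}

// "1 2 [3] 4 5" where page 3 is the current page and not linked
const demo = [
  { text: 'Home', href: '/' },
  { text: '1', href: '/list?page=1' },
  { text: '2', href: '/list?page=2' },
  { text: '4', href: '/list?page=4' },
  { text: '5', href: '/list?page=5' },
  { text: 'About', href: '/about' }
];
console.log(findNextByNumbers(demo, '/list?page=3'));
```

Like in the script, a later match overwrites an earlier one, so the order of the links matters.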

If you have any idea how to improve it:

Feel free to leave a comment 🙂
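
If your idea is a new word for the big RegExp, a quick check outside the browser helps before posting it. The following is only a sketch: nextLinkSample is a cut-down copy of a few alternatives from regexps.nextLink, and looksLikeNextText is a hypothetical helper that roughly mirrors the text test in STEP 6 (short link text that matches the pattern).

```javascript
// Cut-down copy of a few alternatives from regexps.nextLink (for demonstration only).
const nextLinkSample = /(Next\s*(page)?|Weiter(e.*)?|(Page)?\s*Suiv(ant)?(e)?(s)?|>([^\|]|$)|»([^\|]|$)|→)/i;

// Hypothetical helper: collapse whitespace, require fewer than 25 characters,
// then test the text against the given pattern.
function looksLikeNextText(text, pattern) {
  const normalized = text.replace(/\s+/g, ' ');
  return normalized.length < 25 && pattern.test(normalized);
}

const samples = [
  'Next page',
  'Weiter',
  'Page suivante',
  '»',
  'Previous',
  'Nextcloud',
  'Read what happens next in our newsletter'
];
for (const s of samples) {
  console.log(JSON.stringify(s), looksLikeNextText(s, nextLinkSample));
}
```

Note that "Nextcloud" passes as well, because the pattern is not anchored to word boundaries. The long sentence is rejected only by the length limit.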

PS: I totally forgot to put some tracking code into the monster RegExp 😀