Implementation: Web browser extension

Jérôme Beau
7 min read · Jul 26, 2024


Browser extensions extend a browser’s functionality (Chrome, Firefox, Safari, Edge, Opera) through custom JavaScript executed at various times or in response to events that occur while browsing. Once published and installed from the browser’s extension store, an extension can, for instance:

  • add an extension button in the browser toolbar (to trigger an action or display a popup)
  • interact with the browsed web page document through a “content script”
  • augment the browser’s contextual menus
  • save info in web storage (for user preferences or cache, typically)
  • change the default web page (when opening a new tab)

How

Unsurprisingly, every web browser started out with its own extension system. Fortunately, their designs have since converged on the following:

  • content scripts injected into browsed pages. With the proper permissions, they can read and/or write the browsed DOM;
  • background scripts performing operations not directly related to the browsed page, such as issuing HTTP requests, caching or performing an action (such as displaying a popup) when the extension button is clicked.

Because they’re not executed in the same thread, content and background scripts communicate through messages (more on this later).

Note that none of those scripts is mandatory: you can just add a toolbar button without accessing browsed pages (and so without asking the user for the corresponding permissions), perform background operations without displaying a popup, or even interact with the page through on-demand script execution instead of a content script (though that requires different permissions).

Files

Accordingly, files to be packaged in a web extension are (still optionally):

  • content script(s) files such as a content.js file for instance;
  • background file(s) such as a background.js file for instance;
  • browser action files such as ones to display a popup (popup.html, popup.js, popup.css for instance) when the extension button is clicked;
  • new tab files (newtab.html, newtab.js, newtab.css for instance) if you choose to override the default “new tab” URL of the browser. As with the extension popup, you should take care to support dark mode here;
  • a _locales subdirectory to host translations, if any;
  • a manifest.json file to sum all that up. While V2 is still supported, you should target V3 syntax (a.k.a. “MV3”).

However, Safari requires setting up a full-fledged Xcode project. Fortunately, the “Resources” of such a project are nearly identical to the above-mentioned web resources directory, so you can build the extension for any other browser as a subset of the Safari one.

The “standard” part of a Safari web extension project can be used as the source for a cross-browser extension.

API

APIs have notoriously been proprietary for a long time. For instance, the API to send a message in Chrome and Opera is:

chrome.runtime.sendMessage({greeting: "hello"})

whereas in Safari, Firefox and (legacy, pre-Chromium) Edge it is:

browser.runtime.sendMessage({ greeting: "hello" })

So, even though Firefox also supports the chrome.* namespace for convenience, building an extension for both families of browsers meant maintaining two codebases… until the rise of the WebExtensions API 🥳.

This API aims to standardize browser extensions under the browser.* namespace, including the use of Promises instead of callbacks. But it is not fully implemented by all browsers 😞 : Chrome is still sticking to chrome.*, Edge supports browser.* but still with callbacks, etc. So, in any case, you should load a polyfill that aligns all these APIs, and load it before any of your own JS files. For instance in your manifest.json:

{
  // ...

  "background": {
    "scripts": [
      "browser-polyfill.js",
      "background.js"
    ]
  },

  "content_scripts": [{
    // ...
    "js": [
      "browser-polyfill.js",
      "content.js"
    ]
  }]
}
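To give a rough idea of what such a polyfill does under the hood, here is a minimal sketch that turns a callback-style function into a Promise-returning one. The promisify helper and the fakeRuntime stand-in are hypothetical names used for illustration only; the actual polyfill is more elaborate and should be used as-is rather than reimplemented.

```javascript
// Hypothetical helper (not part of any browser API): wraps a callback-style
// function, whose last argument is a callback, into a Promise-returning one.
// `runtime` is a chrome.runtime-like object used to detect failures.
function promisify(fn, runtime) {
  return (...args) =>
    new Promise((resolve, reject) => {
      fn(...args, (result) => {
        // Chrome-style APIs report failures through runtime.lastError
        if (runtime && runtime.lastError) {
          reject(new Error(runtime.lastError.message));
        } else {
          resolve(result);
        }
      });
    });
}

// Stand-in for a callback-based API, so the sketch runs outside a browser.
const fakeRuntime = {
  lastError: undefined,
  sendMessage(message, callback) {
    setTimeout(() => callback({ echoed: message.greeting }), 0);
  },
};

const sendMessage = promisify(fakeRuntime.sendMessage.bind(fakeRuntime), fakeRuntime);
sendMessage({ greeting: "hello" }).then((response) => console.log(response.echoed));
```

With such a wrapper, the same calling code can be written with Promises (or async/await) whatever style the underlying browser exposes.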

For HTML documents, such as browserAction popups or tab pages, the polyfill must be included more explicitly:

<!DOCTYPE html>
<html>
<head>
<script type="application/javascript" src="browser-polyfill.js"></script>
<script type="application/javascript" src="popup.js"></script>
</head>
<!-- ... -->
</html>

Also note that whether an API is available at runtime depends on whether:

  • you have enabled/configured the API in your manifest.json;
  • you’re using the chrome.* or browser.* namespace, with or without the polyfill;
  • you’re using API names matching your manifest version or the polyfill version;
  • you’re using an API that is available on your browser and platform (some chrome.* APIs are truly Chrome-specific and may only be available in some Chrome versions, or even only on ChromeOS).

For instance, the browser.action API will be available only if:

  • you’re using Firefox/Safari/Edge or have included the polyfill,
  • and you’re in a background script (this API is not available from content scripts),
  • and you’re using the MV3/polyfill action name instead of the MV2 browserAction name.
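The conditions above can be checked defensively at runtime. The following is a minimal sketch, assuming a hypothetical getActionApi helper that receives the extension namespace and returns whichever toolbar-button API it exposes, if any.

```javascript
// Hypothetical helper: returns the toolbar-button API exposed by the given
// extension namespace (`browser` or `chrome`), or undefined when there is
// none (for instance when called from a content script).
function getActionApi(api) {
  if (!api) return undefined;
  // Try the MV3 name first, then fall back to the MV2 name
  return api.action ?? api.browserAction;
}

// In an extension you would pass the real namespace, for example:
//   const action = getActionApi(globalThis.browser ?? globalThis.chrome);
//   if (action) action.setBadgeText({ text: "1" });
```

Checking for the API before using it keeps the same script from crashing in a context or manifest version where it is not exposed.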

Background tasks

Extensions should provide a service without impairing browser usage, especially regarding performance, and this is why “background” pages (background or button/popup) run in a parallel thread. Since they run separately and cannot access the browsed DOM, they won’t block browsing even when performing compute-intensive tasks.

Another reason their lifecycle is decoupled from browsed pages is that this lets them maintain extension state across multiple web pages.

Such background pages started as HTML files containing <script> tags (or iframes). You would declare them in your manifest.json as below:

{
  "background": {
    "page": "my-background.html"
  }
}

Then, as most use cases only required scripts, the scripts option was added (in such a case, an HTML page with the relevant <script> tags is generated on the fly):

{
  "background": {
    "scripts": ["jquery.js", "my-background.js"]
  }
}

Since then, Manifest V3 introduced support for a standard way of handling all of this: service workers, acting as extension-dedicated web workers:

{
  "background": {
    "service_worker": "my-sw.js"
  }
}

In MV3, Chrome supports only this kind of background script… but Firefox does not support it (yet). This mismatch between V2 and V3 support calls for some tricky configuration in order to define cross-browser background tasks.
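One possible way to deal with this, sketched below under assumptions, is to generate one manifest per target browser from a shared base. The manifestFor function, the target names and the base manifest values are all hypothetical; the file names are the ones used in the examples above.

```javascript
// Hypothetical shared part of the manifest (values are placeholders)
const baseManifest = {
  manifest_version: 3,
  name: "My extension",
  version: "1.0",
};

// Hypothetical build-time helper: returns the manifest for a target browser,
// choosing the background declaration that this browser supports.
function manifestFor(target) {
  const background =
    target === "firefox"
      ? { scripts: ["browser-polyfill.js", "background.js"] } // scripts list
      : { service_worker: "my-sw.js" }; // service worker
  return { ...baseManifest, background };
}

console.log(JSON.stringify(manifestFor("firefox"), null, 2));
console.log(JSON.stringify(manifestFor("chrome"), null, 2));
```

A small script like this only needs to run when packaging, so it does not add any build step to the extension code itself.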

Messages

Because they aren’t executed by the same thread, background pages and content scripts communicate through asynchronous messages. For instance, a content script might listen for a given message:

browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
  if (message.someType === "giveMeText") {
    const pageText = document.body.textContent // DOM elements are not serializable as a whole
    sendResponse({pageText})
  }
})

and a background script might send such a message to use that DOM data:

browser.tabs.onActivated.addListener(async (activeInfo) => {
  // Ask the content script of the newly activated tab for its text
  const response = await browser.tabs.sendMessage(activeInfo.tabId, {someType: "giveMeText"})
  console.log("Text from tab is", response.pageText)
})

and vice versa: the content script could send a message to background scripts and wait for a response… if the background script is ready.
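The reverse direction could look like the sketch below. The "giveMeSettings" message type, the settings payload and both function names are made up for illustration; the extension namespace is passed as a parameter so that the same code works with browser.* (through the polyfill).

```javascript
// Background side: answers a hypothetical "giveMeSettings" request.
// `api` is the extension namespace (browser.*, through the polyfill).
function registerSettingsHandler(api, settings) {
  api.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.someType === "giveMeSettings") {
      sendResponse({ settings });
    }
  });
}

// Content-script side: sends the request and waits for the response.
async function fetchSettings(api) {
  const response = await api.runtime.sendMessage({ someType: "giveMeSettings" });
  return response.settings;
}

// In an extension, each function would be called from its own script, e.g.:
//   background.js: registerSettingsHandler(browser, { theme: "dark" })
//   content.js:    fetchSettings(browser).then((settings) => console.log(settings))
```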

Indeed, while initially designed to be loaded automatically and remain in memory (i.e. be persistent), background scripts were soon optimized to be loaded only on demand (e.g. when the button is pressed or when a listened-for event is triggered; this is why background pages are sometimes called “event pages”).

For instance, if you send a message to a popup script before the extension button is pressed, you’ll get “Error: Could not establish connection. Receiving end does not exist” until the popup is opened and therefore ready to listen.
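A sender can at least tolerate that situation. The sketch below assumes a hypothetical sendOrFallback helper that recognizes the error by its message text (taken from the error quoted above), which is admittedly fragile, and resolves to a fallback value instead of failing.

```javascript
// Hypothetical helper: sends a message through `send` and resolves to
// `fallback` when nobody is listening yet, instead of rejecting.
async function sendOrFallback(send, message, fallback) {
  try {
    return await send(message);
  } catch (error) {
    // Matching on the message text is an assumption and may vary by browser
    if (String(error && error.message).includes("Receiving end does not exist")) {
      return fallback;
    }
    throw error; // unrelated failures still surface
  }
}

// In an extension, `send` would wrap the real API, for example:
//   sendOrFallback((m) => browser.runtime.sendMessage(m), { someType: "refresh" }, null)
```

This only hides the error: the message is still lost, which is why the designs listed next are usually preferable.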

So, because background scripts can be disposed of at any time, you can no longer rely on in-memory state, and have to use a more robust design instead, such as:

  • instead of sending messages to the popup, make the popup ask for data;
  • write the data to storage and make the popup read it (typically on browser.storage.onChanged).
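The storage-based option can be sketched as follows, assuming the "storage" permission is declared in the manifest. The publishPageText and watchPageText names, as well as the pageText key, are hypothetical.

```javascript
// Background side: persist the data instead of pushing it to a popup
// that may not be open. `api` is the extension namespace (browser.*).
async function publishPageText(api, pageText) {
  await api.storage.local.set({ pageText });
}

// Popup side: read the current value when opening, then follow updates.
async function watchPageText(api, render) {
  const { pageText } = await api.storage.local.get("pageText");
  if (pageText !== undefined) render(pageText);
  api.storage.onChanged.addListener((changes, areaName) => {
    if (areaName === "local" && changes.pageText) {
      render(changes.pageText.newValue);
    }
  });
}

// In an extension:
//   background.js: publishPageText(browser, response.pageText)
//   popup.js:      watchPageText(browser, (text) => { document.body.textContent = text })
```

Because the data lives in storage, it survives the background script being unloaded, and the popup gets it whenever it happens to open.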

Test

Every browser lets you manually test an unpackaged (i.e. non-zipped) extension:

  • Chrome through chrome://extensions then Load Unpacked Extension. The extensions UI will display potential errors occurring when loading/executing the provided extension.
  • Firefox through about:debugging#/runtime/this-firefox then Load Temporary Add-on.
  • Safari through a run of the extension project from Xcode. This will install the app in your local Safari.

Deployment

While file structure and APIs have been somewhat unified, each publication process is specific to the targeted “store”. However, all of them require the same steps:

  1. Packaging the extension as a ZIP file;
  2. Signing in with a developer account, with two-step verification enabled.

Specifics will be:

  • Chrome extensions have to be published on the Chrome Web Store, using your Google developer account (which requires a $5 one-time registration fee). You can target both desktop and Android Chrome, but make sure to test on Android first.
  • Submitting to the Firefox store requires a Firefox account. Its automated checks are pickier than Chrome’s: they will warn you about hidden files, dynamic parts in code and a missing browser_specific_settings manifest ID. They will also complain if you provide a ZIP file of the extension’s directory instead of a ZIP of the directory’s contents.

Review

Your submitted code will be reviewed automatically and, depending on its complexity, possibly manually. Should you want to ease the process, make sure to:

  • avoid any unnecessary files;
  • avoid any unnecessary variability (use constants and inlined values);
  • avoid build tools: plain JavaScript without any further build process will be easier to check and test.

Conclusion

Browser extensions are like the JavaScript of JavaScript: they initially seemed very proprietary, very obscure, poorly documented and constantly changing… so that only a few developers attempted to look at them.

Today, however, like JavaScript again, they have reached a level of maturity that nearly meets the standards of industrial development, aside from a few minor discrepancies in behavior and support level.

Developing an extension requires a deep understanding of the lifecycle of each script type, notably the fact that a script may not be loaded yet or may be unloaded at any time, as well as the ability to design a solution that is resilient to such conditions.
