JavaScript scope hoisting is broken
Most modern JavaScript bundlers implement an optimization called "scope hoisting". The idea is that rather than wrapping each bundled module in a function, a compiler concatenates the modules into a single scope. Say you have a program like this:
// index.js
import {add} from './math';
console.log(add(2, 3));
// math.js
export function add(a, b) {
  return a + b;
}
When scope hoisted, the output bundle would look like this:
function add(a, b) {
  return a + b;
}
console.log(add(2, 3));
The compiler renames any top-level variables that would conflict between the two modules, and concatenates the modules following the order of the import statements. This idea was popularized by Rollup, and is now implemented in many other bundlers, such as Parcel and ESBuild.
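To make the renaming step concrete, here is a minimal sketch of what a hoisted bundle could look like when two modules declare the same top-level name. The modules a.js and b.js, the label variable, and the $1 suffix are illustrative assumptions, not the exact output of any particular bundler.

```javascript
// Assumed sources (hypothetical):
//   a.js: const label = 'a'; export const fromA = () => label;
//   b.js: const label = 'b'; export const fromB = () => label;

// Hoisted output: both modules now share one scope, so the second
// `label` is renamed to avoid colliding with the first.
const label = 'a';
const fromA = () => label;

const label$1 = 'b';
const fromB = () => label$1;

console.log(fromA(), fromB()); // a b
```

Without the rename, the second declaration would be a syntax error, so the bundler has to track every top-level binding in every module it concatenates.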
In theory, this is a nice idea. The alternative, which was common before scope hoisting, is to wrap each module in a separate function.
let modules = {
  'index.js': (require, exports) => {
    let {add} = require('math.js');
    console.log(add(2, 3));
  },
  'math.js': (require, exports) => {
    exports.add = function add(a, b) {
      return a + b;
    };
  }
};
This keeps the module scopes separate, but compared with scope hoisting, the bundle is larger, and there is extra indirection through the require and exports objects at runtime.
Code splitting
The problem with scope hoisting is that it is fundamentally at odds with code splitting. It works great when you are concatenating all of your dependencies into a single bundle. In that case, the imported code runs linearly, and simply replacing import statements with the code they import works correctly.
Code splitting breaks this assumption. Most real-world applications have more than one entry point: multiple pages, dynamic import() calls, and so on. These entry points usually have many dependencies in common, such as frameworks like React and libraries like lodash.
Rather than starting from each entry point and inlining all dependencies, bundlers implement smart algorithms to automatically extract common modules between entries into separate bundles. This avoids duplicating code between pages, better utilizing the browser's HTTP cache.
Let me illustrate with an example.
// entry-a.js
import React from 'react';
import _ from 'lodash';
export function EntryA() {
  return <div>Entry A</div>;
}
// entry-b.js
import React from 'react';
import _ from 'lodash';
export function EntryB() {
  return <div>Entry B</div>;
}
A simple bundler would create two bundles:
entry-a.js
+ react.js
+ lodash.js
entry-b.js
+ react.js
+ lodash.js
But this duplicates React and lodash in each entry. Instead, most bundlers split the common dependencies (React and lodash) into a separate bundle so it can be shared between entries:
entry-a.js
entry-b.js
react.js
+ lodash.js
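The extraction step can be sketched in a few lines. This is a deliberately naive version, assuming a hypothetical graph object that maps each module to its imports; real bundlers use far more sophisticated heuristics (size thresholds, request limits, and so on).

```javascript
// Hypothetical module graph: each module lists the modules it imports.
const graph = {
  'entry-a.js': ['react.js', 'lodash.js'],
  'entry-b.js': ['react.js', 'lodash.js'],
  'react.js': [],
  'lodash.js': [],
};

// Collect every module reachable from an entry, including the entry itself.
function reachableFrom(entry) {
  const seen = new Set();
  const stack = [entry];
  while (stack.length > 0) {
    const id = stack.pop();
    if (seen.has(id)) continue;
    seen.add(id);
    stack.push(...graph[id]);
  }
  return seen;
}

// Modules reached from more than one entry go into a shared bundle;
// everything else stays with the single entry that uses it.
function splitBundles(entries) {
  const usedBy = new Map(); // module id -> set of entries that reach it
  for (const entry of entries) {
    for (const id of reachableFrom(entry)) {
      if (!usedBy.has(id)) usedBy.set(id, new Set());
      usedBy.get(id).add(entry);
    }
  }
  const bundles = {shared: []};
  for (const entry of entries) bundles[entry] = [];
  for (const [id, owners] of usedBy) {
    if (owners.size > 1) bundles.shared.push(id);
    else bundles[[...owners][0]].push(id);
  }
  bundles.shared.sort(); // stable output for display
  return bundles;
}

const bundles = splitBundles(['entry-a.js', 'entry-b.js']);
console.log(bundles);
```

For the graph above, each entry bundle contains only its own entry module, and react.js and lodash.js land in the shared bundle.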
Side effects
Now, back to scope hoisting. With code splitting, the bundler can no longer simply inline every import statement, because some of the modules may now live in different bundles. Instead, it might emit a new import statement like this to load the shared bundle:
// entry-a.bundle.js
import {React, _} from 'shared.bundle.js';
export function EntryA() {
  return <div>Entry A</div>;
}
// entry-b.bundle.js
import {React, _} from 'shared.bundle.js';
export function EntryB() {
  return <div>Entry B</div>;
}
// shared.bundle.js
export {React, _};
So far, so good.
But JavaScript modules can have more than just exports. They can contain arbitrary statements: function calls, variable assignments, etc. These can have side effects on the execution environment. Therefore, JavaScript modules are sensitive to the order in which they execute. If they run in a different order, the program's behavior may differ.
Let's look at a slightly more complicated example.
// entry-a.js
import './a1';
import './a2';
// entry-b.js
import './b1';
import './b2';
// a1.js
import './shared1';
console.log('a1');
// a2.js
import './shared2';
console.log('a2');
// b1.js
import './shared1';
console.log('b1');
// b2.js
import './shared2';
console.log('b2');
// shared1.js
console.log('shared1');
// shared2.js
console.log('shared2');
When running entry-a.js without bundling, this code logs shared1, a1, shared2, a2, in that order: each module's imports are evaluated before the module's own code runs.
When bundling, the code splitting algorithm identifies that shared1.js and shared2.js are common dependencies of both entry-a.js and entry-b.js, so it splits them out into their own bundle. Then scope hoisting runs, inlining the a1, a2, b1, and b2 modules.
// entry-a.bundle.js
import 'shared.bundle.js';
console.log('a1');
console.log('a2');
// entry-b.bundle.js
import 'shared.bundle.js';
console.log('b1');
console.log('b2');
// shared.bundle.js
console.log('shared1');
console.log('shared2');
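But now, when running entry-a.bundle.js, the logs are output in a different order: shared1, shared2, a1, a2. The whole shared bundle is evaluated as soon as it is imported, so shared2 now runs before a1.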
This example is broken in Rollup, ESBuild, and Rolldown (though there is an experimental option to wrap modules in functions).
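The two orderings can be reproduced with a small simulation. The sketch below models each module as a list of imports plus the strings it logs, and evaluates the graph the way ES modules are evaluated: dependencies first, in import order, each module at most once. The unbundled, hoisted, and run names are hypothetical.

```javascript
// Each module: the modules it imports (in order) and what it logs when it runs.
const unbundled = {
  'entry-a.js': {imports: ['a1.js', 'a2.js'], logs: []},
  'a1.js': {imports: ['shared1.js'], logs: ['a1']},
  'a2.js': {imports: ['shared2.js'], logs: ['a2']},
  'shared1.js': {imports: [], logs: ['shared1']},
  'shared2.js': {imports: [], logs: ['shared2']},
};

// The same program after code splitting and scope hoisting.
const hoisted = {
  'entry-a.bundle.js': {imports: ['shared.bundle.js'], logs: ['a1', 'a2']},
  'shared.bundle.js': {imports: [], logs: ['shared1', 'shared2']},
};

// Evaluate dependencies first, in import order, and each module only once.
function run(graph, entry) {
  const evaluated = new Set();
  const output = [];
  function evaluate(id) {
    if (evaluated.has(id)) return;
    evaluated.add(id);
    for (const dep of graph[id].imports) evaluate(dep);
    output.push(...graph[id].logs);
  }
  evaluate(entry);
  return output;
}

const before = run(unbundled, 'entry-a.js');
const after = run(hoisted, 'entry-a.bundle.js');
console.log(before.join(', ')); // shared1, a1, shared2, a2
console.log(after.join(', '));  // shared1, shared2, a1, a2
```

The difference is harmless for console.log calls, but the same reordering applies to any side effect, such as a polyfill that must be installed before the code that relies on it.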
Solution
The solution to this problem is wrapping each shared module in a function as discussed at the start of this post. That way, the execution order of the modules can be controlled.
// entry-a.bundle.js
import modules from 'shared.bundle.js';
modules['shared1']();
console.log('a1');
modules['shared2']();
console.log('a2');
// entry-b.bundle.js
import modules from 'shared.bundle.js';
modules['shared1']();
console.log('b1');
modules['shared2']();
console.log('b2');
// shared.bundle.js
export default {
  'shared1': () => console.log('shared1'),
  'shared2': () => console.log('shared2'),
};
This is basically what Parcel does. Each module that is accessed from outside the bundle it is declared in is wrapped in a function, which the importing module calls when it needs it. If a module is wrapped, all of its dependencies must be wrapped as well. In practice, in real-world applications, most modules end up wrapped in a function, which negates most of the benefit of scope hoisting.
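One detail the simplified listing above glosses over is that a module must still run only once, even if both entries end up loaded on the same page. A minimal single-file sketch of that behavior, assuming a hypothetical once helper and modeling each entry bundle as a plain function, could look like this (it is not Parcel's actual runtime):

```javascript
const logs = [];

// Hypothetical helper: run the wrapped module body on the first call only.
function once(body) {
  let done = false;
  return () => {
    if (done) return;
    done = true;
    body();
  };
}

// Stand-in for shared.bundle.js: each shared module is a wrapped function.
const shared = {
  shared1: once(() => logs.push('shared1')),
  shared2: once(() => logs.push('shared2')),
};

// Stand-ins for the two entry bundles, with a1/a2 and b1/b2 hoisted inline.
function entryA() {
  shared.shared1();
  logs.push('a1');
  shared.shared2();
  logs.push('a2');
}
function entryB() {
  shared.shared1();
  logs.push('b1');
  shared.shared2();
  logs.push('b2');
}

entryA();
entryB();
console.log(logs.join(', ')); // shared1, a1, shared2, a2, b1, b2
```

The order for entry A matches the unbundled program again, and loading entry B afterwards does not re-run the shared modules.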
Webpack implements something similar. It also supports module concatenation, which performs partial scope hoisting for a group of modules that are only accessed within the same bundle (among other conditions). This is likely the best available solution that is also correct. Parcel may implement something like this in the future.
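The two wrapping rules described above (wrap anything imported from another bundle, then wrap all of its dependencies) amount to a simple graph propagation. The sketch below is an illustration under assumptions: the deps and bundleOf tables and the util module are hypothetical, and neither Parcel nor webpack is claimed to implement it this way.

```javascript
// Hypothetical inputs: module dependencies and the bundle each module lives in.
const deps = {
  'entry-a': ['a1', 'a2'],
  'a1': ['shared1'],
  'a2': ['shared2'],
  'shared1': ['util'],
  'shared2': [],
  'util': [],
};
const bundleOf = {
  'entry-a': 'A', 'a1': 'A', 'a2': 'A',
  'shared1': 'S', 'shared2': 'S', 'util': 'S',
};

function findWrapped(deps, bundleOf) {
  const wrapped = new Set();
  // Rule 1: a module imported from a different bundle must be wrapped.
  for (const [id, imports] of Object.entries(deps)) {
    for (const dep of imports) {
      if (bundleOf[dep] !== bundleOf[id]) wrapped.add(dep);
    }
  }
  // Rule 2: every dependency of a wrapped module must be wrapped too.
  const stack = [...wrapped];
  while (stack.length > 0) {
    const id = stack.pop();
    for (const dep of deps[id]) {
      if (!wrapped.has(dep)) {
        wrapped.add(dep);
        stack.push(dep);
      }
    }
  }
  return wrapped;
}

const wrapped = findWrapped(deps, bundleOf);
const hoistable = Object.keys(deps).filter((id) => !wrapped.has(id));
console.log([...wrapped].sort()); // [ 'shared1', 'shared2', 'util' ]
console.log(hoistable);           // [ 'entry-a', 'a1', 'a2' ]
```

Only the modules left in hoistable are candidates for concatenation, which is why the share of hoisted modules shrinks quickly as more code moves into shared bundles.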
Other problems
Side effects are just one of the problems with scope hoisting. Another is that it breaks the this value within exported functions.
import * as foo from './foo';
foo.bar();
// foo.js
export function bar() {
  console.log(this);
}
Without bundling, the above example logs the foo.js module (i.e. an object containing the bar function).
After bundling with scope hoisting, the bundle might look like this:
function bar() {
  console.log(this);
}
bar();
But now the bar() function is being called directly, with no object property access. Therefore, the this value is undefined (in strict mode). This is broken in Rollup, ESBuild, Parcel, Rolldown, and webpack.
This gets even more complicated when re-exports are involved, because the this value should be the re-exporting module, not the module where the function is declared.
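The difference is easy to observe in a single file. In this sketch a frozen plain object stands in for the module namespace object (an assumption for illustration; a real namespace object is created by the module system), and a function-level 'use strict' directive mirrors the strict mode that ES modules always run in.

```javascript
function bar() {
  'use strict'; // ES modules are always strict; emulate that here
  return this;
}

// Stand-in for `import * as foo from './foo'`.
const foo = Object.freeze({bar});

const viaNamespace = foo.bar(); // property access: `this` is foo
const viaHoisting = bar();      // direct call after hoisting: `this` is undefined

console.log(viaNamespace === foo);      // true
console.log(viaHoisting === undefined); // true
```

Code that relies on this in exported functions is rare, but the behavior is observable, so replacing foo.bar() with bar() is not a semantics-preserving transformation.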
Is scope hoisting worth it?
That's what I've been wondering lately. It seems like an awful lot of complexity for pretty limited optimization potential.
When Rollup was created, it did not support code splitting at all. In that case, scope hoisting can (mostly) work well. But this is a very limited use case: basically bundling libraries or very small applications. Bundling libraries is not really a good idea either, but that's a topic for another post.
Once code splitting is introduced, scope hoisting has very limited benefit. In my tests with Parcel on real-world apps, fewer than 5% of the modules avoided being wrapped in a function. Basically, only the modules within entry bundles can be scope hoisted: anything in a shared bundle, or accessed via dynamic import(), must be wrapped. Module concatenation as webpack implements it may help with this, but even then there are a lot of cases where the optimization bails out.
Scope hoisting was also supposed to help with tree shaking. By making references between modules static variable accesses rather than going through a function, minifiers could work better because they could "see" across modules and remove code that wasn't used. But that's only a benefit if you're relying on a general purpose minifier to implement tree shaking. Bundlers can also collect this information themselves and perform tree shaking even when modules are wrapped, and even when referenced between different bundles.
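As a rough illustration of bundler-driven tree shaking, the sketch below marks every export reachable from what the entry actually uses and reports the rest as removable. The exportGraph table, the module#export keys, and the sub, square, and mul exports are hypothetical; a real bundler would build this information from the parsed import and export statements.

```javascript
// Hypothetical metadata: each export lists the other exports it references.
const exportGraph = {
  'math.js#add': [],
  'math.js#sub': [],
  'math.js#square': ['math.js#mul'],
  'math.js#mul': [],
};

// Exports the entry module imports and uses.
const roots = ['math.js#add', 'math.js#square'];

// Mark everything reachable from the roots.
function markUsed(graph, roots) {
  const used = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const id = stack.pop();
    if (used.has(id)) continue;
    used.add(id);
    stack.push(...graph[id]);
  }
  return used;
}

const used = markUsed(exportGraph, roots);
const removable = Object.keys(exportGraph).filter((id) => !used.has(id));
console.log(removable); // [ 'math.js#sub' ]
```

Nothing here depends on how the modules are emitted, so the same analysis works whether a module is hoisted, wrapped in a function, or placed in a different bundle.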
One other potential benefit of scope hoisting is runtime performance. In 2016, Nolan Lawson wrote a post about the cost of small modules. He attempted to measure both the bundle size and runtime cost of function wrappers around modules. At the time, however, only Rollup implemented tree shaking, so the bundle size cost was larger than in reality. The runtime cost (basically the cost of object property access vs static variable reference) is interesting. I would be curious to test this again using modern hardware and JS engines and see if it's still a factor today. On the other hand, lazily evaluating modules only when they are needed rather than all up front can have a performance benefit too.
So in summary: I'm not sure scope hoisting is worth it anymore, and I'm considering removing it in Parcel v3. Of course, we'll still have tree shaking, dead code elimination, constant folding, and other optimizations, but I'm considering a design where modules are wrapped in functions by default to improve correctness and reduce complexity. I'll report back how it goes.