The trick to get around this is to move smoothly up and down the gradient of social interaction intensity, never dropping below a basic floor of presence: the sense that there are other people in the same place as you.
Instead of having two modes, “in a call” and “on my own,” we need to think about multiple ways of being together which, minimally, could be:
In a video call
In an anteroom to a video call, hearing the sound of others
In a doc together
On my desktop but with the sense that colleagues are around
The designer's job, then, is to make sure the software provides these different contexts, rather than the binary of on-a-call/not-on-a-call, and to design the transitions between them.
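One way to picture this gradient is as an ordered set of contexts, with transitions that move one step at a time. The sketch below is only an illustration: the names `Presence`, `step_toward`, and `path` are made up for this example, and the ordering of the four contexts by intensity is an assumption based on the list above.

```python
from enum import IntEnum
from typing import List


class Presence(IntEnum):
    """The four contexts listed above, ordered by intensity (assumed ordering)."""
    DESKTOP_AMBIENT = 0  # on my desktop, with the sense that colleagues are around
    SHARED_DOC = 1       # in a doc together
    ANTEROOM = 2         # outside the call, hearing the sound of others
    VIDEO_CALL = 3       # in the video call


def step_toward(current: Presence, target: Presence) -> Presence:
    """Move one level toward the target instead of jumping straight to it."""
    if target > current:
        return Presence(current + 1)
    if target < current:
        return Presence(current - 1)
    return current


def path(current: Presence, target: Presence) -> List[Presence]:
    """List every context passed through on the way to the target."""
    steps = [current]
    while steps[-1] != target:
        steps.append(step_toward(steps[-1], target))
    return steps
```

Because the lowest level is still a form of presence, no transition in this sketch can drop below the floor. Calling `path(Presence.DESKTOP_AMBIENT, Presence.VIDEO_CALL)` passes through the shared doc and the anteroom on the way into the call.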
Super Nintendo games were the flavor of the decade when I was younger, and there’s no better example of building incredible things within comparatively meager constraints. Developers on SNES titles were limited to, among other things:
15-bit color (a palette of 32,768 colors).
8-channel stereo output.
Cartridges with storage capacities measured in megabits, not megabytes.
Limited 3D rendering capabilities, and only on select titles that embedded a special chip in the cartridge.
Despite these constraints, game developers cranked out incredible and memorable titles that will endure beyond our lifetimes. Yet, the constraints SNES developers faced were static. You had a single platform with a single set of capabilities. If you could stay within those capabilities and maximize their potential, your game could be played—and adored—by anyone with an SNES console.
PC games, on the other hand, had to be developed within a more flexible set of constraints. I remember one of my first PC games had its range of system requirements displayed on the side of the box:
Have at least a 386 processor—but Pentium is preferred.
Ad Lib or PC speaker supported—but Sound Blaster is best.
Show up to the party with at least 4 megabytes of RAM—but more is better.
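The box's "at least X, but Y is better" wording amounts to two tiers of requirements. A minimal sketch of that check follows, using only the figures from the box. The names `Machine` and `rate`, the numeric rankings, and the decision to treat an unlisted sound device as unsupported are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Rankings are assumptions for this sketch; a higher number means more capable.
CPU_RANK = {"386": 1, "486": 2, "Pentium": 3}
SOUND_RANK = {"PC speaker": 1, "Ad Lib": 1, "Sound Blaster": 2}
MIN_RAM_MB = 4  # the box gives a minimum only ("more is better")


@dataclass
class Machine:
    cpu: str
    sound: str
    ram_mb: int


def rate(machine: Machine) -> str:
    """Classify a machine as 'unsupported', 'playable', or 'best'."""
    cpu = CPU_RANK.get(machine.cpu, 0)        # unlisted CPUs count as below minimum
    sound = SOUND_RANK.get(machine.sound, 0)  # unlisted sound devices likewise
    if cpu < CPU_RANK["386"] or sound < 1 or machine.ram_mb < MIN_RAM_MB:
        return "unsupported"
    if cpu >= CPU_RANK["Pentium"] and sound >= SOUND_RANK["Sound Blaster"]:
        return "best"
    return "playable"
```

Here `rate(Machine("386", "PC speaker", 4))` lands in the playable tier, while a Pentium with a Sound Blaster reaches the best one. The point is that the same game has to behave acceptably across the whole range, not just at the top of it.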