It’s not only a matter of splitting the computation in the most efficient way; it’s also a matter of building in redundancy. Each machine computes a little more than what is strictly needed to complete a calculation, and that little extra makes the communication phase faster, even if the local computation takes more memory.
“You are trading off local storage space for computation speed. We call it communication load. There are fewer things you need to communicate,” Tuninetti said. “We are trying to figure out the tradeoff between those storage resources and the number of things you have to communicate, either through a central server or directly with other workers in a distributed computation problem.”
Tuninetti will use error-correcting codes to handle the distributed nature of the computations. While such codes have been used for decades to protect information from various communication impairments, only in recent years have they been applied to distributed computing frameworks. Erasure-correcting codes cleverly add redundancy to the data so that it can be recovered even if some pieces get lost (i.e., are erased). They essentially fill in the blanks: information a machine never received as its piece of the computation is treated as if it had been “lost,” and the code recovers it. These codes can be used to speed up the communication phase among machines.
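To make the idea concrete, here is a minimal sketch of the simplest erasure code, a single XOR parity piece (not the specific codes used in the project): the one extra piece lets any single lost piece be rebuilt from the others.

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(chunks: list[bytes]) -> list[bytes]:
    """Append one parity chunk (the XOR of all data chunks)."""
    return chunks + [reduce(xor, chunks)]

def recover(received: list[bytes | None]) -> list[bytes]:
    """Rebuild a single missing chunk (marked None) by XOR-ing the rest."""
    missing = [i for i, c in enumerate(received) if c is None]
    assert len(missing) <= 1, "one parity chunk tolerates one erasure"
    if missing:
        received[missing[0]] = reduce(xor, [c for c in received if c is not None])
    return received[:-1]  # drop the parity, keep the data

# Three equal-size data chunks; lose any one and still recover everything.
sent = encode([b"dist", b"ribu", b"ted!"])
sent[1] = None                                   # chunk 1 "erased" in transit
assert b"".join(recover(sent)) == b"distributed!"
```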
“Part of this grant is using codes to speed up data shuffling when machines have to exchange information—this is what I’m most excited about,” Tuninetti said.
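As a toy illustration of coded data shuffling, here is a sketch of the classic three-worker example from the coded distributed computing literature (not necessarily the scheme being developed under this grant). Each worker redundantly maps two of three files, and in exchange, one half-size XOR multicast per worker replaces three full-size unicasts. The intermediate values `v` are hypothetical placeholders.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical intermediate values v[(key, file)], each split into two halves.
# Worker k reduces key k and needs that key's value for every file; the
# redundant mapping below leaves each worker missing exactly one value.
v = {(k, n): bytes(10 * k + n + i for i in range(8))
     for k in (1, 2, 3) for n in (1, 2, 3)}

def half(x: bytes, i: int) -> bytes:
    return x[4 * i: 4 * i + 4]

# Computation load 2: worker 1 maps files {1, 2}, worker 2 maps {2, 3},
# worker 3 maps {1, 3}. Each worker multicasts ONE half-size XOR packet
# that is simultaneously useful to both other workers.
m1 = xor(half(v[2, 1], 0), half(v[3, 2], 0))   # built from files 1 and 2
m2 = xor(half(v[1, 3], 0), half(v[3, 2], 1))   # built from files 2 and 3
m3 = xor(half(v[2, 1], 1), half(v[1, 3], 1))   # built from files 1 and 3

# Worker 1 (missing v[(1, 3)]) cancels the halves it computed itself:
assert xor(m2, half(v[3, 2], 1)) == half(v[1, 3], 0)
assert xor(m3, half(v[2, 1], 1)) == half(v[1, 3], 1)
# Workers 2 and 3 decode symmetrically. Three half-size multicasts replace
# three full-size unicasts: the redundant mapping halved the shuffle load.
```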
The other part of the grant involves pre-loading content a user may want into the device’s local memory ahead of time, to reduce the total number of transmissions once the user actually requests it. This lightens the communication load on congested networks during peak times.
“The idea is that we can somehow store at 3 a.m. in the memory of a phone something the user may want to watch or listen to at 5 p.m. If it’s stored locally, I don’t have to get it from the network, and it’s a way to balance the network utilization,” Tuninetti said.
“About five years ago, people realized we shouldn’t look at each device individually but as part of a network. Not just what a particular user would want, but what all the users globally would want,” Tuninetti said. “The reduction in the communication load scales proportionally to the amount of memory globally available in the network. The more users, the merrier.”
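To see why the global memory is what matters, here is a back-of-the-envelope sketch using the well-known coded-caching load formula from the research literature (the article itself does not spell one out; the function name and parameters here are illustrative): K users, a catalog of N files, and M files cached per user.

```python
def coded_caching_load(K: int, N: int, M: float) -> float:
    """Server transmissions (in file-sized units) under the classic coded
    caching scheme of Maddah-Ali and Niesen: K users, a catalog of N files,
    a cache of M files at each user.

    The 1 / (1 + K*M/N) factor is the "global" gain: it grows with the
    total memory K*M in the network, not just with each user's own cache.
    """
    return K * (1 - M / N) / (1 + K * M / N)

# Each user caches 10 files out of a 100-file catalog. Adding users adds
# global memory, so the load grows far more slowly than the user count.
for K in (10, 20, 40):
    print(K, round(coded_caching_load(K, N=100, M=10), 2))
# Prints 4.5, 6.0, 7.2 -- versus uncoded loads of 9, 18, 36.
```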
Reducing the communication load: an example
Serving many users at once, or “creating multicast opportunities,” is key to reducing the communication load. For example, say Alice and Bob want to listen to two different songs. Alice has already stored Song A locally, having listened to it at some point in the past, and now wants to listen to Song B. Bob, meanwhile, has already stored Song B locally and now wants to listen to Song A.
The classical approach is to deliver Song B to Alice and then Song A to Bob, for a total of two song-sized transmissions.
Instead, Alice and Bob can be served simultaneously by delivering the “coded data” Song A + Song B, which is the size of a single song. Each of them takes the coded data and subtracts off what is stored locally, recovering the desired song. Since both songs were sent in one transmission, this “coded delivery” cuts the number of required transmissions in half.
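In code, the “Song A + Song B” packet is simply a bitwise XOR, and each listener cancels the song already in their cache. A minimal sketch, assuming the two songs are equal-length byte strings:

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

song_a = b"song-a-bits"
song_b = b"song-b-bits"

# One multicast transmission, the size of a single song.
coded = xor(song_a, song_b)

# Alice cached Song A and wants Song B; Bob cached Song B and wants Song A.
assert xor(coded, song_a) == song_b   # Alice subtracts what she has stored
assert xor(coded, song_b) == song_a   # Bob does the same
```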