A load balancing system consists of a dispatcher that routes the incoming jobs to the queues of a subset of homogeneous servers. When the instantaneous queue length or queue workload is known, the optimal dispatch policy is to join the shortest queue (JSQ) or join the queue with the smallest workload (JSW). However, in most real-life systems, this information is not available. One way to tackle this problem is to query this information from a random subset of these servers and implement this policy on that subset. However, this policy becomes intractable in large systems.
In redundancy-based load balancing policies, the dispatcher creates replicas of the incoming job and sends it to a subset of the servers, thus avoiding the need to query instantaneous state information. However, these policies require dispatcher-side cancellation of the redundant jobs once the job has been served by one of the servers. Synchronized cancelling of the redundant replica is an implementation challenge. This work proposes a policy in which the dispatcher duplicates each job and appends a timer to each replicated job referred to as a server-side cancellation criterion. A replica is discarded if its timer expires before it starts receiving service. The authors analyze several variants of this policy which are novel and simple to implement. Numerical computations on variants of the proposed policy indicate that the proposed policy outperforms popular feedback-based policies for low arrival rates, despite no feedback from servers to the dispatcher.