I have a problem where I have up to 10,000 entities and I’m trying to obtain rankings of these entities from pairwise comparisons (in each comparison one entity is a winner, the other is a loser), doing the least amount of pairwise comparisons.
There is a paper proposing an algorithm for such a problem and I want to have a working implementation for it [1]. The paper does have a code example on GitHub but the provided example implementation on GPU doesn’t work when there are more than 200 entities, it gives an out-of-memory error.
So your job would be to
- Evaluate whether there is a memory usage issue related to the complexity of the algorithm, meaning can this algorithm work for me if I have 10,000 entities and I don’t have a supercomputer? I could get 10-20 rented GPUs from vast.ai if each iteration of the algorithm would last up to 10 min)
- Fix the GPU implementation to make it properly work when there are many entities, plus add support for distributed computing (e.g. maybe via Ray).
[1] “Active Sampling for Pairwise Comparisons via Approximate Message Passing and Information Gain Maximization” https://arxiv.org/pdf/2004.05691.pdf
[2] GitHub - gfxdisp/asap: Fast and accurate Active SAmpling method for Pairwise comparisons