We consider the issue of scalability of graph-based semi-supervised learning (SSL)
algorithms. In this context, we propose a fast graph node ordering algorithm that
improves parallel spatial locality by being cache cognizant. This approach allows
for a linear speedup on a shared-memory parallel machine to be achievable, and thus
means that graph-based SSL can scale to very large data sets. We use the above
algorithm an a multi-threaded implementation to solve a SSL problem on a 120
million node graph in a reasonable amount of time.