In practice, machine learning systems deal with multiple datasets over time. When
the feature spaces between these datasets overlap, it is possible to transfer
information from one task to another. Typically in transfer learning, all labeled
data from a source task is retained and applied to a new target task, raising
concerns about privacy, memory, and scaling. To ameliorate such concerns, we present a
semi-supervised algorithm for text categorization that transfers information across
tasks without storing the data of the source task. In particular, our technique
learns sparse, low-dimensional word-cluster-based features from the source task
data and a massive amount of additional unlabeled data. Our algorithm is efficient,
highly parallelizable, and outperforms competitive
baselines by up to 9% on several difficult benchmark text categorization tasks.
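To give a flavor of the cluster-based features described above, the following is a minimal, hypothetical sketch: once a sparse word-to-cluster projection has been learned (here represented as a plain mapping from words to a cluster id and weight), each document's bag-of-words counts can be collapsed into low-dimensional cluster features without retaining any source-task documents. The function and variable names are illustrative assumptions, not the paper's actual API.

```python
def project(bow, clusters):
    """Map word counts to low-dimensional cluster features.

    bow      : dict mapping word -> count for one document
    clusters : dict mapping word -> (cluster_id, weight); a sparse
               projection assumed to have been learned beforehand
               from source-task and unlabeled data
    """
    feats = {}
    for word, count in bow.items():
        if word in clusters:
            cid, weight = clusters[word]
            # Accumulate the weighted count into the word's cluster.
            feats[cid] = feats.get(cid, 0.0) + weight * count
    return feats

# Toy usage: two clusters covering four vocabulary words.
clusters = {"ball": (0, 1.0), "goal": (0, 0.8),
            "stock": (1, 1.0), "market": (1, 0.9)}
doc = {"ball": 2, "goal": 1, "market": 3}
print(project(doc, clusters))  # {0: 2.8, 1: 2.7}
```

Because each word maps to only a few clusters, the projection is sparse, and documents from many tasks can be embedded this way in parallel; only the compact cluster mapping, not the source data, needs to be stored.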