Abstract
Efficient retrieval of relevant documents from massive
 collections remains an essential challenge in Information Retrieval
 (IR). Modern search engines face immense computational demands,
 requiring novel approaches that reduce resource usage without
 sacrificing retrieval effectiveness. This thesis makes significant
 contributions to improving search efficiency through three distinct
 methods. First, we introduce innovative document reordering
 techniques specifically optimized for dynamic pruning algorithms.
 Our proposed methods achieve up to a 1.33x speed-up in query
 processing, accompanied by negligible increases in index size and
 minimal impact on retrieval quality. Second, we present novel sparse
 centroid retrieval strategies tailored to the ColBERT neural
 retrieval model. These techniques accelerate ColBERT-based retrieval
 by up to 4.6x while maintaining high effectiveness and minimal
 additional indexing overhead. Third, we propose static
 pruning methods for ColBERT document embeddings that eliminate
 approximately one-third of the tokens from indexed documents without
 any loss in retrieval effectiveness. Critically, our pruning methods
 require no separate training stages, ensuring ease of integration
 into existing retrieval systems. Collectively, these contributions
 offer substantial advancements in retrieval efficiency, making
 large-scale IR systems faster, more scalable, and economically
 sustainable.