Due to the ambiguity of user queries and the rapidly growing amount of data on the internet, methods for diversifying search results have gained importance in recent years. While earlier work mostly focuses on text search, a similar need exists for image data, which grows rapidly as people produce and share images via their smartphones and social media applications such as Instagram, Snapchat, and Facebook. Therefore, in this thesis, we focus on the result diversification problem for image search. To this end, as our first contribution, we adopt R-LTR, a supervised learning approach originally proposed for textual data, and modify it to tune the weights of visual and textual features separately, as required for better diversification. As a second contribution, we extend R-LTR with an alternative paradigm that takes into account an upper bound on the future diversity contribution that the result being scored can provide. We implement R-LTR and its variants using PyTorch's neural network framework, which enables us to go beyond the original linear formulation. Finally, we create an ensemble of the most promising approaches for the image diversification problem. Our experiments on a benchmark dataset with 153 queries and 45K images reveal that the adopted supervised algorithm, R-LTR, significantly outperforms various ad hoc diversification approaches in terms of the sub-topic recall metric. Furthermore, certain variants of R-LTR proposed here are superior to the original method and may provide additional (relative) gains of up to 2.2%.
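To make the idea concrete, the following is a minimal, illustrative PyTorch sketch of R-LTR-style sequential (greedy) scoring with separate learnable weights for visual and textual similarity, as described above. It is not the thesis implementation: all feature shapes, the random inputs, and the choice of max-similarity to the selected set as the diversity penalty are assumptions made for this example.

```python
# Illustrative sketch (not the thesis implementation): greedy sequential
# selection where each candidate's score is its relevance minus a weighted
# penalty for similarity to already-selected results. Visual and textual
# similarity get SEPARATE learnable weights. Shapes and inputs are assumed.
import torch

torch.manual_seed(0)

n_docs, n_rel = 6, 4                      # candidates, relevance features (assumed sizes)
rel_feats = torch.rand(n_docs, n_rel)     # per-document relevance features (random stand-in)
vis_sim = torch.rand(n_docs, n_docs)      # pairwise visual similarity (random stand-in)
txt_sim = torch.rand(n_docs, n_docs)      # pairwise textual similarity (random stand-in)

# Learnable parameters: relevance weights plus separate weights for the
# visual and textual similarity signals, so they can be tuned independently.
w_rel = torch.nn.Parameter(torch.ones(n_rel))
w_vis = torch.nn.Parameter(torch.tensor(1.0))
w_txt = torch.nn.Parameter(torch.tensor(1.0))

def score(i, selected):
    """Relevance minus weighted penalties for similarity to selected docs."""
    s = rel_feats[i] @ w_rel
    if selected:
        idx = torch.tensor(selected)
        # Penalize the candidate's closest match in the current ranking.
        s = s - w_vis * vis_sim[i, idx].max() - w_txt * txt_sim[i, idx].max()
    return s

# Greedy sequential construction of a diversified top-3 ranking.
selected = []
remaining = list(range(n_docs))
for _ in range(3):
    best = max(remaining, key=lambda i: score(i, selected).item())
    selected.append(best)
    remaining.remove(best)
print(selected)
```

Because the scoring function is differentiable in `w_rel`, `w_vis`, and `w_txt`, such a formulation can in principle be trained end-to-end, and the linear relevance term can be replaced by a small neural network, in the spirit of the non-linear variants mentioned above.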