The recent popularity of mobile devices has increased the demand for mobile network services and applications that require minimal delay. 5G mobile networks are expected to provide much lower latency than present mobile networks. One conventional way of decreasing latency is to cache content closer to the end user. However, currently deployed caching methods are not effective enough. In this thesis, we propose a new smart caching strategy that predicts subsequent user requests and prefetches the necessary content, significantly decreasing end-to-end latency in 5G systems. We employ semantic inference through mobile edge computing to deduce what the end user may request next and prefetch that content.