Scene modeling is crucial for robots that need to perceive, reason about, and manipulate the objects in their environments. In this thesis, we propose a variant of Boltzmann Machines (BMs) for contextualized scene modeling. Although many computational models have been proposed for this problem, ours is the first to bring together objects, relations, and affordances in a single, highly capable generative model. To this end, we introduce a hybrid version of BMs in which relations and affordances are incorporated through shared, tri-way connections. We evaluate our method against several baselines on four tasks: detecting missing objects, detecting out-of-context objects, estimating relations, and estimating affordances. We also illustrate the scene generation capabilities of the model.