Follow @ProbabilityPuz All too often, when we deal with data the outcome needed is a strategy or an algorithm itself. To arrive at that strategy we may have historic data or some model on how entities in system respond to various situations. In this write up, I'll go over the method of reinforcement learning. The general idea behind reinforcement learning is to come up with a strategy to maximize some measurable goal. For example, if you are modelling a robot that learns to navigate around obstacles, you want the learning process to come back with a strategy that minimizes collisions (say) with other entities in the environment. Pattern Recognition and Machine Learning (Information Science and Statistics) For the sake of simplicity, lets assume the following scenario. A robot is placed (at random) on flat plank of wood which has some sticky glue in the center. To its left there is a hole which damages the robot a bit and to its right is a reward which is its dest