Batch gradient descent computes the gradient of the cost function over the entire training dataset before making an update. Theoretically this sounds good, since we want to model our input dataset, let us say X, as well as possible; however, it can be computationally quite expensive. How can we reduce this cost? Well, we could decrease the size of the inputs. But, we want … Continue reading Why does batch size matter?
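To make the "one update per full pass" idea concrete, here is a minimal sketch of batch gradient descent for linear regression with a mean-squared-error cost. The learning rate, epoch count, and the toy dataset are illustrative assumptions, not from the original post:

```python
import numpy as np

def batch_gradient_descent(X, y, lr=0.1, epochs=500):
    """Minimize MSE for linear regression, one weight update per full pass over X."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # The gradient is averaged over the ENTIRE dataset before a single update,
        # which is what makes each step expensive for large X.
        grad = 2.0 / len(X) * X.T @ (X @ w - y)
        w -= lr * grad
    return w

# Toy example: recover the weights of y = 2*x0 + 3*x1.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, 3.0])
w = batch_gradient_descent(X, y)
```

Mini-batch variants reduce the per-step cost by computing the gradient on a subset of rows instead of all of X, trading gradient accuracy for cheaper updates.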
This was a pretty smooth week, a bit inclined towards nice, I would say, just like the title suggests. I got to scratch a few of the curiosities I had been carrying for a long time, without any harmful damage for the foreseeable future. Hence, Lumiere. This weekend London is hosting the Lumiere festival. It is kind of a … Continue reading 3/ Lumiere