Kaggle – Bimbo Group Wrap-up


I finished in the top 65% on the private leaderboard, which counts as the official score for this contest. Of 1,969 teams, my team (just me) ranked 1,261st with an RMSLE (root mean squared logarithmic error, the contest's accuracy measure) of 0.56330.
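For anyone unfamiliar with the metric, here is a minimal sketch of how RMSLE is computed, assuming predictions and actual demand arrive as plain Python lists of non-negative numbers. The function name `rmsle` is only illustrative and is not taken from my contest code or from Kaggle's scoring implementation.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error between two equal-length sequences.

    Assumes every value is non-negative, since log1p(x) is undefined for x <= -1.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # log1p(x) = log(1 + x), so a value of zero is handled without error.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical numbers, purely for illustration.
score = rmsle([3, 0, 7], [4, 1, 7])
```

Because the error is taken on a log scale, being off by one unit on a small order costs more than being off by one unit on a large order, and a lower score is better.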



On the public leaderboard, however, I ranked in the top 18% with an RMSLE of 0.45970.

This second score is a lot better because I was able to submit my model for scoring up to three times a day and refine it accordingly.  At one point I was ranked in the top 11%, but this was short-lived, lasting no more than a week or so before some statistics giants woke up and ate my lunch.


Continue reading

Kaggle – Grupo Bimbo Preparing The Data

With the database from the last post in mind, we can now go over the information provided for this contest.  Most interesting to me is the distribution of inventory delivered versus inventory returned.


Above, we can see the number of units sold each week.  The green portion of the bar indicates the number of units consumed and the red portion indicates the number of units returned (unsold) from the previous week.


Here we can see the monetary amount for units sold per week, together with the monetary amount not sold from the units returned from the previous week.

Let's prepare the data that gets us here.

Continue reading

Kaggle – Grupo Bimbo Inventory Demand


For complete information on this competition, please go to Maximize sales and minimize returns of bakery goods.  In a nutshell, Grupo Bimbo, maker of the cookies from our childhood, presents an optimization problem with a lot of data, in the hope of delivering the right amount of inventory to meet demand without overestimating it.

My interest in this competition comes from a random email from Kaggle and a fondness for the cookies common in the lunchboxes of our youth.  With zero Kaggle experience and just as little experience with the problem at hand, this makes for an interesting problem to look at.

Continue reading