We consider a repeated newsvendor problem where the demand distribution is unknown ex ante and has to be learned from sales/censored data. To shed light into scenarios where the demand may be non-stationary, e.g., exhibiting trends, seasonalities, or strategic behavior on the side of customers, we view the problem as a “game" between the inventory manager and an oblivious opponent who, prior to the beginning of the game, decides a sequence of demands for the different periods arbitrarily. We propose a randomized inventory management policy that performs well with respect to the regret criterion, i.e., the difference between the policy’s cumulative cost and the cumulative cost of the best fixed action/ordering decision in hindsight, for any given demand sequence. More specifically, the proposed policy achieves optimal regret scaling (up to logarithmic terms) with respect to the number of periods, the demand support, and the number of ordering decisions available to the inventory manager. This is joint work with Gabor Lugosi and Gergely Neu.