My research interests lie broadly in online learning, optimization, and asymptotic analysis, with a particular emphasis on applying these methods to inventory and revenue management.

Research Directions

Low-dimensional Heuristics Design for Operations Management

Many supply chain and revenue management problems have increasingly complicated features, e.g., positive lead times, multiple echelons, multiple products, and fluctuating demand environments. As a result, the optimal policies for these problems usually have a complex structure, which can make them hard to implement and can invite strategic behavior from customers. When the problem dimension is large, computing the optimal policy becomes intractable due to the notorious curse of dimensionality. In this direction, we investigate how to design low-dimensional heuristics that are determined by only a few parameters, such as base-stock, $(s,S)$, and static pricing policies. We aim for heuristics that are both easy to implement in practice and supported by strong theoretical guarantees.
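To illustrate what "determined by a few parameters" means, the sketch below simulates a single-item inventory system under a base-stock policy: the entire policy is captured by one number, the order-up-to level S. The demand distribution, cost rates, and zero lead time are hypothetical choices for illustration only, not assumptions from any specific paper.

```python
import random

def simulate_base_stock(S, horizon=10_000, h=1.0, b=4.0, seed=0):
    """Average per-period cost of a base-stock policy in a backlogged
    single-item system with zero lead time: each period, order up to S.

    h: per-unit holding cost, b: per-unit backlog penalty (illustrative values).
    """
    rng = random.Random(seed)
    inventory = S  # start at the base-stock level
    total_cost = 0.0
    for _ in range(horizon):
        demand = rng.randint(0, 10)   # hypothetical i.i.d. demand
        inventory -= demand           # negative inventory = backlog
        total_cost += h * max(inventory, 0) + b * max(-inventory, 0)
        inventory = S                 # zero lead time: replenish up to S
    return total_cost / horizon

# Because the policy class is one-dimensional, tuning it is a
# one-dimensional search rather than a full dynamic program:
best_S = min(range(0, 21), key=simulate_base_stock)
```

The point of the example is the contrast in complexity: the true optimal policy of a high-dimensional inventory problem may depend on the entire system state, whereas this heuristic is fixed by a single parameter and can be tuned by simulation.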

Learning and Optimization of Structured Policies for Markov Decision Processes

Many Markov decision processes have been shown to admit structured optimal policies or near-optimal low-dimensional heuristics, as discussed above. However, existing studies mainly establish that a (near-)optimal policy falls within a certain policy family; they rarely give algorithmic methods for finding the best policy in that family. In this direction, we investigate how to design new algorithms for optimizing structured policies for Markov decision processes; examples include base-stock policies and constant-order policies in inventory control. We also aim to provide new insights into classical data-driven methods, such as Stochastic Gradient Descent (SGD) and Sample Average Approximation (SAA).
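As a minimal sketch of optimizing within a structured policy family using data, the code below runs SGD directly on the base-stock level S for a newsvendor-style cost C(S) = E[h(S-D)^+ + b(D-S)^+]; a stochastic subgradient at a demand sample d is h if d <= S and -b otherwise. The demand distribution, cost rates, and step-size schedule are illustrative assumptions, not taken from any particular result.

```python
import random

def sgd_base_stock(h=1.0, b=4.0, steps=20_000, eta0=2.0, seed=1):
    """SGD on the order-up-to level S for the newsvendor cost
    C(S) = E[h*(S-D)^+ + b*(D-S)^+].

    A stochastic subgradient at demand sample d is h if d <= S, else -b.
    All parameter values here are illustrative.
    """
    rng = random.Random(seed)
    S = 0.0
    for t in range(1, steps + 1):
        d = rng.uniform(0, 10)        # hypothetical demand distribution
        grad = h if d <= S else -b    # per-sample subgradient of the cost
        S -= eta0 / t**0.5 * grad     # diminishing step size ~ 1/sqrt(t)
        S = max(S, 0.0)               # keep the level nonnegative
    return S
```

For this uniform demand and these costs, the critical fractile is b/(b+h) = 0.8, so the iterates should settle near S = 8. SAA would instead draw a fixed batch of demand samples and minimize the resulting empirical cost in one shot; both approaches search over the one-dimensional policy family rather than the full state-dependent policy space.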