Monday, Aug 7: 2:55 PM - 3:20 PM
Invited Paper Session
Metro Toronto Convention Centre
Dynamic treatment regimes formalize precision medicine as a sequence of decision rules, one for each stage of clinical intervention, that map current patient information to a recommended intervention. Optimal regimes are typically defined as maximizing some functional of a scalar outcome's distribution, e.g., the distribution's mean or median. However, in many clinical applications, there are multiple outcomes of interest that are not easy combined into a single scalar outcome. We consider the problem of estimating an optimal regime when there are multiple outcomes ordered by priority but which cannot be readily combined by domain experts into a single scalar outcome. We propose a definition of optimality in this setting and show that this definition leads to maximal mean utility under a large class of utility functions. Furthermore, we use inverse reinforcement learning to identify a composite outcome that most closely aligns with our definition within a pre-specified class. Simulation experiments and an application to data from a sequential multiple assignment randomized trial (SMART) on HIV/STI prevention illustrate the usefulness of the proposed approach.