The orbitofrontal cortex (OFC) and ventromedial prefrontal cortex (vmPFC) play a key role in decision-making and encode task states in addition to expected value. We review evidence suggesting a connection between value and state representations and argue that OFC/vmPFC integrate stimulus, context, and outcome information. Comparable encoding principles emerge in the late layers of deep reinforcement learning (RL) models, where single nodes exhibit similar forms of mixed selectivity, which enables flexible readout of relevant variables by downstream neurons. Based on these lines of evidence, we suggest that outcome maximization leads to complex representational spaces that are insufficiently characterized by the linear value signals on which most prior research has focused. Major outstanding questions concern the role of OFC/vmPFC in learning across tasks and in encoding task-irrelevant aspects, as well as the role of hippocampus-PFC interactions.
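As a toy illustration of the mixed-selectivity argument above (a minimal sketch, not a model from the studies reviewed), the following code simulates a hypothetical task in which the rewarded stimulus flips with context, so value is an XOR-like conjunction of stimulus and context. A layer of units that nonlinearly mixes the two variables lets a downstream linear readout recover the context-dependent value signal, which is not linearly decodable from the raw variables themselves; the task structure, layer sizes, and variable names are all illustrative assumptions.

```python
# Minimal sketch: nonlinear mixed selectivity enables flexible linear
# readout of task variables. All task details here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_units = 2000, 200

# Hypothetical task: binary stimulus and context; the rewarded stimulus
# flips with context, so value is an XOR-like conjunction of the two.
stimulus = rng.integers(0, 2, n_trials).astype(float)
context = rng.integers(0, 2, n_trials).astype(float)
value = (stimulus == context).astype(float)  # context-dependent value

X = np.column_stack([stimulus, context])

# Mixed-selective layer: random projections plus a nonlinearity, so each
# unit responds to nonlinear combinations of stimulus and context.
W = rng.normal(size=(2, n_units))
b = rng.normal(size=n_units)
H = np.tanh(X @ W + b)

def linear_readout_r(features, target):
    """Correlation between a least-squares linear readout and the target."""
    A = np.column_stack([features, np.ones(len(features))])  # add bias term
    w, *_ = np.linalg.lstsq(A, target, rcond=None)
    return np.corrcoef(A @ w, target)[0, 1]

# Value is nearly unreadable from the raw variables with a linear decoder,
# but value, stimulus, and context are all readable from the mixed layer.
print(f"value from raw (stimulus, context): r = {linear_readout_r(X, value):.2f}")
print(f"value from mixed-selective layer:   r = {linear_readout_r(H, value):.2f}")
print(f"stimulus from mixed layer:          r = {linear_readout_r(H, stimulus):.2f}")
print(f"context from mixed layer:           r = {linear_readout_r(H, context):.2f}")
```

The same population thus supports readout of stimulus, context, and value by different downstream linear decoders, which is the sense in which mixed selectivity enables flexible readout of relevant variables rather than a single linear value axis.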