In this paper, we aim to learn the ordinal relationships among age labels while also handling the heterogeneous data produced by the non-stationary aging process, using a mixture model of deep networks. Building on the insight that the non-stationary aging process can be decomposed into a series of stationary subprocesses, we adopt a divide-and-conquer strategy: we first partition the age spectrum into multiple groups and then train a specialized deep network, referred to as an "expert", for each group. The experts are not independent; they are tied together through the model design and a joint training mechanism that consolidates them into a unified system. As a result, ordinal relationships are learned consistently by solving the age-related tasks across the entire age label set. The final age estimate is obtained through hierarchical classification that combines the outputs of all experts. Extensive experiments on several well-known age estimation datasets show that the proposed model outperforms several existing state-of-the-art methods.
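
As a rough illustration of the hierarchical classification step, the sketch below combines a group-level prediction with per-group expert predictions. The abstract does not specify the combination rule, so the product rule P(age) = P(group) · P(age | group), the contiguous partition in `AGE_GROUPS`, the function name `hierarchical_age_estimate`, and the toy scores are all assumptions made for illustration, not the authors' method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical partition of a toy age range 0..8 into three contiguous groups.
AGE_GROUPS = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

def hierarchical_age_estimate(group_scores, expert_scores):
    """Combine group-level and expert-level scores (assumed product rule).

    group_scores:  one raw score per age group.
    expert_scores: one list of raw scores per group, one score per age
                   in that group (the output of that group's expert).
    Returns the full age distribution and its expected value.
    """
    group_probs = softmax(group_scores)
    age_probs = {}
    for g, ages in enumerate(AGE_GROUPS):
        within_group = softmax(expert_scores[g])
        for age, p in zip(ages, within_group):
            # Assumed rule: P(age) = P(group) * P(age | group).
            age_probs[age] = group_probs[g] * p
    expected_age = sum(age * p for age, p in age_probs.items())
    return age_probs, expected_age

# Toy scores: the middle group is favoured, and its expert favours age 4.
group_scores = [0.1, 2.0, 0.3]
expert_scores = [[0.0, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 0.0]]
age_probs, expected_age = hierarchical_age_estimate(group_scores, expert_scores)
most_likely_age = max(age_probs, key=age_probs.get)
```

Because each expert's distribution is weighted by its group probability, the result is a single distribution over the entire age label set, which is one way the outputs of all experts could contribute to the final estimate.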