Abstract

Subsurface earth models, also known as geomodels, are essential for characterizing and developing complex subsurface systems. Traditional geomodel generation methods, such as multiple-point statistics, can be time-consuming and computationally expensive. Generative artificial intelligence (AI) offers a promising alternative, with the potential to generate high-quality geomodels more quickly and efficiently. This paper proposes a deep-learning-based generative AI workflow for geomodeling that comprises two deep learning models: a hierarchical vector-quantized variational autoencoder (VQ-VAE-2) and a PixelSNAIL autoregressive model. The VQ-VAE-2 learns to compress geomodels into a low-dimensional, discrete latent representation. The PixelSNAIL then learns the prior distribution of the latent codes. To generate a geomodel, the PixelSNAIL samples a latent code from this prior, and the VQ-VAE-2 decoder converts the sampled code into a new geomodel. The PixelSNAIL supports both unconditional and conditional generation. In unconditional generation, the workflow produces an ensemble of geomodels without any constraint. In conditional generation, the workflow produces an ensemble of geomodels similar to a user-defined source geomodel, which makes it possible to control and manipulate the generated geomodels. To improve the generation of fluvial channels in the geomodels, we train the VQ-VAE-2 with a perceptual loss instead of the traditional mean absolute error loss. At a given compression ratio, the proposed generative AI method generates multi-attribute geomodels of higher quality than single-attribute geomodels.
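The discrete latent representation described above can be illustrated with a minimal NumPy sketch of vector quantization, the step in which each continuous encoder output is replaced by the index of its nearest codebook vector. This is an illustration under stated assumptions, not the implementation used in this work: the function name `quantize`, the array shapes, and the toy sizes are hypothetical, and uniform random sampling stands in for the trained PixelSNAIL prior.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each latent vector to the index of its nearest codebook entry.

    z_e:      (H, W, D) continuous encoder output (hypothetical shape)
    codebook: (K, D) embedding vectors
    Returns (H, W) integer codes and the (H, W, D) quantized latents.
    """
    flat = z_e.reshape(-1, z_e.shape[-1])  # (H*W, D)
    # Squared Euclidean distance between every latent and every code
    d2 = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (H*W, K)
    codes = d2.argmin(axis=1).reshape(z_e.shape[:-1])
    return codes, codebook[codes]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))  # toy sizes: K=8 codes of dimension D=4
z_e = rng.normal(size=(3, 3, 4))    # toy 3x3 latent grid
codes, z_q = quantize(z_e, codebook)

# Generation: a trained autoregressive prior would supply the codes.
# Uniform sampling here is a placeholder assumption only.
sampled_codes = rng.integers(0, len(codebook), size=(3, 3))
z_sampled = codebook[sampled_codes]  # would be passed to the decoder
```

In this sketch a geomodel is summarized by the small integer grid `codes`, which is what makes it practical to learn a prior over the codes rather than over the full-resolution model.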