Emotion recognition via electroencephalography (EEG) has emerged as a pivotal domain in biomedical signal processing, offering valuable insights into affective states. This paper presents a novel approach that uses a tailored Transformer-based model to predict valence and arousal levels from EEG signals. Unlike traditional Transformers, which operate on a single sequential input, our model accommodates multiple EEG channels concurrently, enhancing its ability to discern intricate temporal patterns across the brain. The modified Transformer architecture enables comprehensive exploration of the spatiotemporal dynamics linked to emotional states. The model achieves mean accuracies of 92.66% for valence and 91.17% for arousal prediction, validated through 10-fold cross-validation across subjects on the DEAP dataset. Because the model is trained per subject, our methodology offers promising avenues for advancing understanding and applications in EEG-based emotion recognition. This research contributes to the broader discourse in biomedical signal processing, paving the way for refined methodologies in decoding the neural correlates of emotion, with implications for domains including brain-computer interfaces and human-robot interaction.
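To make the multi-channel idea concrete, the sketch below shows one plausible way a Transformer encoder can consume all EEG channels at once: the vector of channel values at each time step is projected into the model dimension, so self-attention runs over time while every token already mixes information across channels. This is a minimal illustrative sketch in PyTorch, not the authors' exact architecture; the layer sizes, the learned positional embedding, the mean pooling, and the two classification heads (for binarized valence and arousal labels, as commonly used on DEAP) are all assumptions.

```python
# Hypothetical sketch of a multi-channel EEG Transformer; hyperparameters
# and the channel-embedding scheme are illustrative assumptions.
import torch
import torch.nn as nn

class EEGTransformer(nn.Module):
    def __init__(self, n_channels=32, d_model=64, n_heads=4,
                 n_layers=4, n_classes=2, max_len=512):
        super().__init__()
        # Project all channel values at each time step into d_model,
        # so each token carries cross-channel information.
        self.input_proj = nn.Linear(n_channels, d_model)
        # Learned positional embedding over time steps (an assumption).
        self.pos_embed = nn.Parameter(torch.zeros(1, max_len, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer,
                                             num_layers=n_layers)
        # Separate heads for valence and arousal (binarized labels).
        self.valence_head = nn.Linear(d_model, n_classes)
        self.arousal_head = nn.Linear(d_model, n_classes)

    def forward(self, x):
        # x: (batch, n_channels, n_timesteps), e.g. raw or preprocessed EEG
        x = x.transpose(1, 2)                   # (batch, time, channels)
        x = self.input_proj(x) + self.pos_embed[:, :x.size(1)]
        x = self.encoder(x)                     # (batch, time, d_model)
        x = x.mean(dim=1)                       # average-pool over time
        return self.valence_head(x), self.arousal_head(x)

model = EEGTransformer()
# 8 trials, 32 channels (as in DEAP), 512 samples per trial (illustrative).
eeg = torch.randn(8, 32, 512)
valence_logits, arousal_logits = model(eeg)     # each of shape (8, 2)
```

Under this framing, attending over time with channel-mixed tokens lets the encoder capture the spatiotemporal structure the abstract refers to; other designs (e.g., attention over channels, or separate per-channel streams) are equally possible and are not ruled out by the text.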