A transformer is an emerging neural network model that employs an attention mechanism. It has been applied to various tasks and has achieved favorable accuracy compared with CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks). Although the attention mechanism is recognized as a general-purpose component, many transformer models require a large number of parameters and are thus not suited to low-cost edge devices. Recently, a resource-efficient hybrid model was proposed that uses ResNet as a backbone architecture and replaces some of its convolutional layers with an MHSA (Multi-Head Self-Attention) mechanism. In this paper, we significantly reduce the parameter size of this approach by using Neural ODE (Ordinary Differential Equation) as a backbone architecture for the MHSA mechanism. The proposed hybrid model reduces the parameter size by 97.3% compared with the original model without degrading the accuracy. Because the model is so small, it is implemented on a Xilinx ZCU104 FPGA (Field-Programmable Gate Array) board so that it can fully exploit on-chip BRAM/URAM resources. The FPGA implementation is evaluated in terms of resource utilization, accuracy, performance, and power consumption. The results demonstrate that it speeds up the model by up to 2.63 times compared with software execution, without accuracy degradation.
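To make the architectural idea concrete, the following is a minimal sketch (in PyTorch, with assumed layer sizes, step counts, and class names that are not taken from the paper) of the two ingredients mentioned above: a Neural-ODE-style residual block that reuses a single small convolution across several fixed-step Euler integration steps, so added "depth" costs no extra parameters, and an MHSA block standing in for a convolutional stage, as in the ResNet+MHSA hybrid. It is an illustration of the general technique, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ODEConvBlock(nn.Module):
    """dx/dt = f(x) with one shared conv, integrated by a fixed-step Euler solver."""
    def __init__(self, channels: int, steps: int = 4):
        super().__init__()
        self.f = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.steps = steps

    def forward(self, x):
        h = 1.0 / self.steps                   # Euler step size
        for _ in range(self.steps):            # same weights reused at every step
            x = x + h * self.f(x)
        return x


class MHSABlock(nn.Module):
    """Multi-head self-attention over the spatial positions of a feature map."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):
        b, c, hgt, wid = x.shape
        seq = x.flatten(2).transpose(1, 2)     # (B, H*W, C) token sequence
        out, _ = self.attn(seq, seq, seq)
        seq = self.norm(seq + out)             # residual connection + layer norm
        return seq.transpose(1, 2).reshape(b, c, hgt, wid)


# Hypothetical tiny hybrid: ODE blocks as the backbone, MHSA near the output.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    ODEConvBlock(64),
    ODEConvBlock(64),
    MHSABlock(64),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 10),
)

print(model(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])
```

Because the ODE block integrates one convolution rather than stacking distinct layers, the parameter count stays roughly constant as the effective depth grows, which is the property the paper exploits to shrink the hybrid model to a size that fits in on-chip FPGA memory.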