Wi-Fi-based indoor localization has gained much attention worldwide owing to the widespread reach and availability of Wi-Fi. Among the several possible approaches that use Wi-Fi signals, the fingerprint image-based approach has become popular due to its low hardware requirements. Furthermore, this approach can be used alone or together with other positioning systems for indoor localization. However, a multi-building, multi-floor indoor positioning system with high localization accuracy is still required. Motivated by this, we propose a convolutional neural network (CNN)-based approach. For feature extraction and classification, a multi-output, multi-label sequential 2D-CNN classifier is developed and implemented. The system predicts the user's location by combining the classification outputs of the multi-output model. The approach is verified on the publicly available UJIIndoorLoc database, on which the system achieves an average indoor localization accuracy of 97%.
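To make the combination step concrete, the following is a minimal sketch of how per-output class scores could be merged into a single location prediction. It is not the implementation described here: the head names `building` and `floor`, the helper names `argmax` and `combine_outputs`, and the class counts in the usage note are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple


def argmax(scores: List[float]) -> int:
    """Return the index of the largest score (first index on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def combine_outputs(head_scores: Dict[str, List[float]]) -> Tuple[int, int]:
    """Merge per-head class scores into one (building, floor) prediction.

    `head_scores` maps a hypothetical head name to that head's class scores,
    e.g. the softmax vector produced by one output branch of the classifier.
    """
    building = argmax(head_scores["building"])  # assumed head name
    floor = argmax(head_scores["floor"])        # assumed head name
    return building, floor
```

For example, with three assumed building classes and five assumed floor classes, `combine_outputs({"building": [0.1, 0.7, 0.2], "floor": [0.05, 0.1, 0.6, 0.2, 0.05]})` selects building index 1 and floor index 2. Taking the most probable class of each head independently is the simplest choice; a joint decision rule over both heads is also possible.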