This paper presents a comparative performance evaluation of well-known feature detection and description algorithms. Although local features have been compared extensively on natural images, their performance on surgical endoscopic images has received far less attention. We thoroughly compare various feature algorithms within a structure-from-motion pipeline that sparsely reconstructs monocular endoscopic video. Our contribution is twofold: (1) we investigate and evaluate the performance of the most widely used local feature detectors and descriptors for structure from motion, and (2) we systematically compare their performance for sparse depth estimation and reconstruction. Our investigation yields novel and useful insights into applying current local features to sparse depth estimation from monocular endoscopic images, which will be used for self-supervised learning-based dense depth recovery.