Deep learning has become a mainstream technology in many learning-related research fields and has begun to influence photogrammetry. Photogrammetry is defined as the discipline that studies the shapes, locations, sizes, characteristics and inter-relationships of real objects from optical images, so it covers two aspects: geometry and semantics. From these two aspects, we review the history of deep learning, discuss its current applications in photogrammetry, and forecast the future development of the field. In geometry, deep convolutional neural networks (CNNs) have been widely applied to stereo matching, SLAM and 3D reconstruction; they have shown some effect, but further improvement is needed. In semantics, conventional methods that rely on empirically designed, handcrafted features have failed to extract semantic information accurately and have failed to produce “semantic thematic maps” as products on a par with the 4D products (DEM, DOM, DLG, DRG) of photogrammetry. As a result, the semantic part of photogrammetry has been neglected for a long time. The strong generalization capacity of deep learning, its ability to fit arbitrary functions, and its stability across varied conditions are making the automatic production of thematic maps possible. We review the results achieved so far in road network extraction, building detection, crop classification and related tasks, and forecast that producing high-accuracy semantic thematic maps directly from optical images will become a reality and that these maps will become a standard product of photogrammetry. Finally, we introduce two of our current studies, one related to geometry and one to semantics. The first is stereo matching of aerial images based on deep learning and transfer learning; the second is precise crop classification from spatio-temporal satellite images based on a 3D CNN.