Urban environments play a crucial role in the design, planning, and management of cities. As urban populations have expanded, the ways in which humans interact with their surroundings have evolved, producing distributions of activity that vary locally and frequently in space and time. Understanding the local urban environment and differentiating people's varying preferences for urban areas has therefore become a major challenge for policymakers. This study leverages geotagged Flickr photographs to quantify the characteristics of different urban areas and to explore the dynamics of areas where more people assemble. An advanced image recognition model is used to extract features from a large number of images taken in Inner London during the period 2013–2015. After these characteristics are integrated, a series of visualisation techniques is used to explore the differences between areas and how they change over time. We find that urban areas with higher population densities cover more iconic landmarks and leisure zones, while other areas are more closely related to scenes of daily life. The dynamic results demonstrate that season determines human preferences for travel modes and activity modes. Our study extends the previous literature on integrating image recognition methods with urban perception analytics and provides new insights for stakeholders, who can use these findings as evidence for decision making.