In this article, we provide longitudinal evidence for the progressive routinization of a grammatical construction used for social coordination in a highly specialized activity context: task-oriented video-mediated interactions. We focus on the methodic ways in which, over the course of 4 years, a second language speaker who is initially a novice to such interactions coordinates the transition between interacting with her coparticipants and consulting her own screen, a move that suspends talk, without creating trouble through halts in progressivity. Initially drawing on diverse resources, she increasingly resorts to a prospective alert constructed around the verb to check (e.g., "I will check"), which eventually routinizes in the lexically specific form "let me check" as a highly context- and activity-bound social action format. We discuss how such change over the participant's video-mediated interactional history contributes to our understanding of social coordination in video-mediated interaction and of how participants recalibrate their grammar-for-interaction while adapting to new situations, languages, or media. Data are in English.