Little is known about young CLIL (Content and Language Integrated Learning) learners’ attention to formal aspects of the target language when engaged in collaborative task-based interaction. Previous research on language-related episodes (LREs) with other populations indicates that certain variables (e.g. target language proficiency or pair formation method) may play a role in the production of LREs. This study investigates the number, types and resolution of LREs produced by primary education CLIL learners in a collaborative picture-ordering and story-telling task as a function of two variables: L2 English proficiency (grade 5 dyads vs. grade 6 dyads) and pairing method (proficiency-matched dyads vs. student self-selected dyads). Findings indicate that young CLIL learners’ interactive behaviour in L2 English, at least in terms of LRE production, does not differ as a consequence of target language proficiency, whereas pair formation method exerts some influence, with self-selected pairs producing and resolving more meaning-based LREs. No differences were found for form-focused LREs.