Towards Emancipatory L2 Instruction: Exploring Significant Learning Outcomes from Collaborative Digital Storytelling

Digital storytelling has been studied extensively in different content areas, but its natural combination with collaborative writing for skills development and reflective practice remains under-researched in pre-service EFL teacher education. This study involved joint tech-enhanced retelling of L2 texts by 56 Turkish EFL teacher candidates, rubric-based peer and teacher assessment of the final products, comparative analysis of complexity, accuracy, and fluency (CAF) between outliers, and process evaluation using the significant learning taxonomy, in order to explore impacts on L2 writing performance, academic learning, and personal growth. Despite assigning lower scores than the teacher-assessors, especially to the top performers, the majority of peers successfully completed the task, effectively performed the future reviewer role, and reacted positively to co-construction and technology integration. CAF and reflection analyses indicated that the biggest difference between the highest- and lowest-scoring groups lay in grammatical accuracy, and that lack of mutual interaction could account for the less cooperative group's poorer performance. The classification of their post-task responses into six kinds of learning gains (foundational knowledge, application, integration, human dimension, caring, and learning how to learn) also revealed that their collaborative digital storytelling (CDS) experience elicited more procedural, critical, creative, and practical thinking on the academic learning front, while disciplinary and integrative thinking may have declined due to a more immediate preoccupation with task achievement. Their critical thinking was mainly organised around consensus-reaching, fluctuating membership, and logistical challenges, and most demonstrated a clear understanding of the role of positive group dynamics in group outcomes. Despite heightened awareness of the performance-boosting, character-forming, and motivational benefits of collective scaffolding and multimodal meaning-making, only a minority could discern the instrumentality of innovative teaching techniques in their future classroom practices.


Introduction
In an age of rapid, and even dramatic, change, where 90% of the world's students can unexpectedly be left out of school due to a global disease outbreak, the concept of learning throughout life has acquired new significance, for it may no longer be hailed merely as the saviour of disadvantaged adults, but rather be re-evaluated, in the fullest sense of the word, as holistic development for life. Within this renewed paradigm, success in life depends on learning to know (learning to learn; availing oneself of educational opportunities), learning to do (acquiring vocational and teamwork skills), learning to be (developing one's human potential, i.e. personality and independence), and learning to live together (developing an understanding of others; building tolerance and interdependence) (Tawil & Cougoureux, 2013).
In order for our youths to become fully-functioning individuals in their local communities, and active global citizens of a brave new world, we need to reimagine the classroom, diversify the learning activities, and extend their schooling beyond traditional chalk-and-talk lectures. Therefore, the use of technology in teaching and learning now matters maybe more than ever. The advent of the internet has made new instructional technologies more widely accessible, provided a myriad of choices for both teachers and learners, and in doing so, changed our understanding of content learning, too. Today, it is more than a body of factual knowledge to be disseminated by the teacher. Content is interactively co-created using different mediums, and digital storytelling is just one of many options we have in turning our classes into more collaborative, communicative, imaginative, and flexible learning spaces.
Digital storytelling plainly means "using computer software to tell a story" (Robin, 2016, p. 18). Essentially, it "integrates traditional and emerging literacies", i.e. "visual images with written text", and facilitates the understanding of content areas (Ohler, 2008, p. 45; Robin, 2008, p. 222). Despite having gained universal appeal in community organisations, it can be considered a fairly recent introduction to instructional technology (Robin, 2016; Sepp & Bandi-Rao, 2015). Born into a world of multimedia, our students are destined to expand their basic literacy skills to "navigate, interpret, design, and interrogate the written, visual, and design elements of the multimodal texts they encounter" (Serafini, 2014, p. 16). Digital storytelling may indeed serve as a transformative tool for developing these multiple literacies. If appropriately used, it can foster the development of research, writing, organisation, technology, presentation, problem-solving, and people skills, which, in turn, lays the foundation for 21st-century competencies (Gregori-Signes, 2008; Robin, 2006).
For effective implementation, the following conditions should be met in digital storytelling projects: i. the link with the course content should be made explicit, ii. student performance should be assessed using rubrics, iii. contextual constraints (e.g. class time/size, mandated curriculum) should be carefully considered, and iv. selected software should provide only enough technical features for ease of student use (Hofer & Swan, 2006). As pre-packaged instruments may not suit the specific context, teachers are recommended to customise their own rubrics by "combining and rewording" proposed criteria (Frazel, 2010;Ohler, 2008, p. 67). It is also advisable to have students self-assess both their performances, and process of new media creation, since "examining the effect of the learning experiences they initiate" enables teachers to "become and remain effective teachers", and their students to "receive the best possible opportunities for reaching success in their learning" (Farrell, 2018a, p. 4;Farrell, 2018b, p. 5;Ohler, 2008;Wright, 2010, p. 288).
Notwithstanding its obvious connection with purposeful language use, and the wide range of benefits it offers in the foreign language classroom, most researchers have focused on investigating the implementation of digital storytelling in different content-area classes such as science and language arts/literacy (Castaneda, 2013;Castaneda & Rojas-Miesse, 2016;Kim, 2018;Reyes-Torres et al., 2012;Sepp & Bandi-Rao, 2015;Sevilla-Pavon et al., 2012). However, the current study postulates that teacher candidates may become more willing to incorporate progressive teaching techniques and technologies into their future classes if they are given the chance to go through and reflect on the learning experience themselves. It thus focuses on examining first-year ELT students' collectively-produced digital storytelling projects, their linguistic text features, and detailed user responses as indicators of overall task achievement, L2 writing performance, and reflective practice respectively, and identifying the significant learning outcomes for their teacher development. In the next section, we will first compare existing studies against certain key features such as participant groups, contexts, research designs, and treatments, then identify research trends and gaps in the particular context of digital storytelling use for L2 instruction, and finally formulate the research questions against this backdrop.
In many existing studies, digital stories as student products did not undergo assessment by teachers and/or peers, and their text characteristics almost always went unnoticed, even if length, embeddedness, and correctness were acknowledged as good predictors of L2 development. Among the studies cited here, only four (i.e. Alcantud-Diaz, 2016;Sepp & Bandi-Rao, 2015;Sevilla-Pavon et al., 2012;Soler-Pardo, 2014) reported teacher and student involvement in the evaluation process, although none but Soler-Pardo (2014) provided even a brief summary of overall results. We know of four other studies (i.e. Choo & Li, 2017;Hur & Suh, 2012;Kesli-Dollar & Tekiner-Tolu, 2015;Reyes-Torres et al., 2012), where digital stories got assessed by the teacher. All except Kesli-Dollar and Tekiner-Tolu (2015), who assigned numerical ratings to 10 teacher-chosen features, simply instantiated developmental changes with work samples. There was a strong tendency to rely instead solely on general proficiency exams, oral communication tests, and teacher-made multiple-choice and essay tests for measuring the learning from digital storytelling (e.g. Balaman, 2018;Kallinikou & Nicolaidou, 2019;Ramirez, 2013;Yang & Wu, 2012;Zakaria & Aziz, 2019). But for Kim's (2018) study, where the students, despite practising narrative writing in their digital stories, were tested with traditional argumentative essays, the linguistic features of their multimodal projects might have been overlooked in assessing L2 writing ability.
While most of these studies were predominantly occupied with supporting quantitative results, they did not employ a theoretical framework, intercoder agreement, and/or quantification in qualitative analyses of user comments as to what they liked/disliked about digital storytelling, and provided sketchy descriptions of a multifaceted intervention by exemplifying recurrent themes (e.g. increased skills/autonomy/motivation) in a few direct quotes, and condensing evaluative responses into dichotomous categories (e.g. positive/negative outcomes, student/practitioner response, initial/process concerns or overall perceptions) (e.g. Castaneda, 2013;Choo & Li, 2017;Kent, 2011;Kim, 2018;Ramirez, 2013;Sevilla-Pavon, 2015).
Consequently, Fink's (2003) taxonomy of significant learning was chosen in this study as the conceptual framework for the content analysis of the participants' responses because he developed a special language for depicting students' learning experiences as demonstrated in their reflective writing, and provided analytical clarity and interpretive depth for the qualitative researcher. According to Fink (2003), even though there are other important kinds of learning such as interpersonal skills, tolerance, character, leadership, and communication skills, teachers traditionally focus on developing the cognitive domain as they formulate their course objectives, and fail to prepare learners for 21st-century society even in higher education. For this reason, Fink (2003) identified six kinds of significant learning in the learner's life: foundational knowledge, application, integration, human dimension, caring, and learning how to learn.
Whereas foundational knowledge concerns the key information learners need to understand and remember, and application refers to the important skills they need to master (i.e. critical, creative, practical thinking, problem-solving, organizational skills), integration involves the connections they need to make between their subject and others, and along with the first two significant learning outcomes, forms the academic learning sphere (Barnes & Caprino, 2016;Fink, 2003). In the personal growth sphere, the human dimension concerns personal and social dimensions of their learning (i.e. learning about oneself, and others), and caring refers to their changing feelings, values, and interests about the subject, whereas learning how to learn involves developing their continuous learning plans (Barnes & Caprino, 2016;Fink, 2003).
Apart from lack of an analytical framework, another gap in existing research relates to combined use of digital storytelling and collaborative writing techniques in the L2 classroom. In their efforts to ensure multimodality from discourse design to production, digital storytellers create 'a community of practice', in which everyone, with more expertise in some element, "scaffold[s] each other's learning", so "joint enterprise" is inherent in every aspect of the digital storytelling process (Vinogradova et al., 2011, p. 180). Nevertheless, a small number of studies attempted to investigate the usefulness of collaborative digital storytelling (CDS), whereby not only resources, but responsibilities and ownership are also shared between group members (e.g. Ming et al., 2014;Ramirez, 2013;Reyes-Torres et al., 2012;Yang & Wu, 2012). Collaborative fiction writing emerged as "a natural field of application" in digital storytelling practices (Rubino et al., 2018, p. 883); however, even fewer studies involved retelling of common classroom texts such as adapted children's stories, and postmodern versions of classic fairy tales in L2 classes (e.g. Hanington et al., 2013;Kim, 2018;Soler-Pardo, 2014).
For the aforementioned reasons, unlike other studies, i. Turkish EFL student teachers were chosen as the study group, ii. localised/contemporised retelling of a previously-studied literary story was practised, iii. CDS performance underwent rubric-based peer and teacher assessment, iv. L2 writing performance was evaluated in terms of textual complexity, accuracy, and fluency (CAF), v. reflective writing was undertaken as part of process evaluation, and vi. Fink's (2003) taxonomy of significant learning was adopted as the theoretical framework for analysing quantitised data in this research. With the purpose of gaining deeper insights into the effects of CDS over their teacher development, the following research questions were formulated: i. How does the CDS experience influence EFL teacher candidates' L2 writing performance? ii. How does the CDS experience influence their academic learning and personal growth?

Method
As it employed two qualitative strands (text and reflection analyses), and both qualitative and quantitative approaches to data analysis through data conversion, this study had a multimethod design (Teddlie & Tashakkori, 2009). 56 freshmen (44% male, 56% female, aged 18-20) were purposefully selected among Turkish EFL teacher candidates studying at the department of foreign language education in a public urban university. Having completed their English preparatory program at B2-level, and possessing no prior experience in collaborative writing and digital storytelling, they provided information-rich cases capable of illuminating the research questions (Patton, 2002).

Data Collection
After engaging in textual activities (guided visualisations, character networks, literary dominoes, previews of film adaptations), the participants had three 50-min technical sessions, where the teacher demonstrated how to use digital storytelling resources (i.e. mystorybook, PhotoStory3, PowerPoint), and provided them with some exemplary projects found online. Then, in self-selected groups of four, they brainstormed, scripted, and picturised their localised/contemporised versions of the Hamlet story. At the end of the three-week period, they presented projects, and evaluated each other's with the rubric developed on the basis of related literature (Kesli-Dollar & Tekiner-Tolu, 2015;Sepp & Bandi-Rao, 2015). Both peer (30%) and two teacher assessors (70%) scored 14 projects on a scale of 1-4 (1: poor, 2: satisfactory, 3: good, 4: excellent) for five qualities: i. story content (inclusiveness of basic storyline), ii. language use (grammar and vocabulary in use), iii. visuality (illustrativeness of created images for narrative scenes), iv. organisation (coherence and cohesion), v. creativity (overall interestingness), and obtained a total score out of 100 by multiplying the sum of all categories by five. They also evaluated the CDS process in response to these prompts from the post-task survey: i. "What have you learnt from your CDS project? Describe the most important knowledge, skills, and values you have developed", and ii. "What challenges have you encountered in your CDS project? Describe the major problems you faced, and how you solved them". The qualitative data was derived from two types of documents, collective multimodal texts, and summative written reflections. The researcher informed them of the study purpose, maintained anonymity by assigning case numbers (G1, ST1), and in protecting their confidentiality, endeavoured to improve consent, candidness of self-reports, and data quality (Miles & Huberman, 1994).
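The scoring scheme described above can be illustrated with a minimal Python sketch. It assumes that the peer and teacher totals are combined as a simple 30/70 weighted sum, which is inferred from the reported weights rather than stated as a formula in the study; the function names and the sample ratings are hypothetical and are not study data.

```python
# The five rubric qualities named in the text, each rated on a 1-4 scale.
RUBRIC_QUALITIES = (
    "story content",
    "language use",
    "visuality",
    "organisation",
    "creativity",
)


def total_score(ratings):
    """Return a project total out of 100: the sum of five 1-4 ratings times five."""
    if set(ratings) != set(RUBRIC_QUALITIES):
        raise ValueError("ratings must cover exactly the five rubric qualities")
    if any(not 1 <= value <= 4 for value in ratings.values()):
        raise ValueError("each rating must lie on the 1-4 scale")
    return 5 * sum(ratings.values())


def weighted_score(peer_total, teacher_total, peer_weight=0.30):
    """Combine peer and teacher totals (assumed weighted sum: 30% peer, 70% teacher)."""
    return peer_weight * peer_total + (1 - peer_weight) * teacher_total


# Hypothetical ratings for a single project (illustrative only).
peer_ratings = dict(zip(RUBRIC_QUALITIES, (3, 3, 4, 3, 3)))
teacher_ratings = dict(zip(RUBRIC_QUALITIES, (4, 3, 4, 4, 3)))

peer_total = total_score(peer_ratings)
teacher_total = total_score(teacher_ratings)
print(peer_total, teacher_total, round(weighted_score(peer_total, teacher_total), 2))
```

Because each quality is rated from 1 to 4, totals under this scheme can only range from 25 to 100 in steps of five.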

Data Analysis
Their CDS projects were subjected to both qualitative and quantitative analysis. After double-marking, two experts assigned average scores. The interrater reliability was calculated by dividing the number of times they agreed on the same score by the total number of times they judged student performance. Since no difference was found between their ratings in 54 out of 70 cases (14 projects rated on five qualities), an acceptable level of 77% agreement was obtained (Stemler & Tsai, 2008). They also identified T-units (an independent clause plus its subordinate clauses), subordinate (adverbial, adjectival, nominal) clauses, and errors (except mechanical ones) in the highest- and lowest-scoring projects. The interrater reliability was calculated as 97% for clause identification, and 91% for error identification. To determine how language use varied in relation to their performance feedback, the high- and low-outlying texts were compared against CAF measures by dividing the total number of: i. clauses (C/T), ii. error-free T-units (EFT/T) and errors (E/T), and iii. words (W/T) by the total number of T-units (Wolfe-Quintero et al., 1998).
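The CAF ratios described above all share the same denominator, the number of T-units. A minimal Python sketch of the calculation follows; the `TextCounts` container, the function name, and the sample counts are hypothetical and are not taken from the study's texts, while the ratio labels (C/T, EFT/T, E/T, W/T) follow the abbreviations used in the findings.

```python
from dataclasses import dataclass


@dataclass
class TextCounts:
    """Raw counts obtained from manual coding of one text (hypothetical container)."""
    words: int
    t_units: int
    clauses: int
    error_free_t_units: int
    errors: int


def caf_measures(counts):
    """Return complexity, accuracy, and fluency ratios per T-unit."""
    if counts.t_units <= 0:
        raise ValueError("a text must contain at least one T-unit")
    return {
        "C/T": counts.clauses / counts.t_units,               # complexity
        "EFT/T": counts.error_free_t_units / counts.t_units,  # accuracy
        "E/T": counts.errors / counts.t_units,                # accuracy
        "W/T": counts.words / counts.t_units,                 # fluency
    }


# Illustrative counts only (not study data).
sample = TextCounts(words=200, t_units=20, clauses=30, error_free_t_units=15, errors=8)
for label, value in caf_measures(sample).items():
    print(label, round(value, 2))
```

Normalising by T-units rather than by raw text length allows texts of different sizes to be compared on the same scale.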
During the content analysis of the post-task responses, two coders independently identified the salient patterns in the participants' reflective statements with respect to Fink's (2003) conceptual framework. After that, they compared the preliminary results, discussed their discrepancies, and reached consensus on the final coding. The intercoder reliability for the reflective statements was calculated as 94% by using Miles and Huberman's (1994) formula (i.e. [reliability = agreements / (agreements + disagreements)]). In order to increase reliability and validity of the findings: i. the qualitative data was quantified, ii. both descriptive statistics (i.e. frequencies, and percentages) and exemplars were provided in appropriate tabulations, iii. cross-categorical comparisons were undertaken, iv. detailed descriptions of the study group and setting were given, v. the respondents were invited to confirm the tentative results, vi. no changes were made to their original statements, vii. all parts of the data were exhaustively examined, and analysed, and viii. where necessary, direct quotes were employed to amplify the interpretations (Creswell, 2007;Silverman & Marvasti, 2008).
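The percent-agreement formula quoted above can be sketched in a few lines of Python. The function names are hypothetical; the only figures used are the 54 matching scores out of 70 judgements reported for the teacher ratings, and the short paired lists in the second example are invented purely for illustration.

```python
def percent_agreement(agreements, disagreements):
    """Miles and Huberman's reliability: agreements / (agreements + disagreements), as a percentage."""
    total = agreements + disagreements
    if total <= 0:
        raise ValueError("at least one judgement is required")
    return 100 * agreements / total


def agreement_from_ratings(rater_a, rater_b):
    """Compute percent agreement from two equally long lists of paired ratings."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must judge the same items")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return percent_agreement(agreements, len(rater_a) - agreements)


# Teacher ratings reported in the text: identical scores in 54 of 70 judgements.
print(round(percent_agreement(54, 70 - 54), 1))  # 77.1, reported as 77%

# Invented paired ratings, for illustration only.
print(agreement_from_ratings([1, 2, 3, 4], [1, 2, 3, 3]))
```

Simple percent agreement does not correct for chance agreement, so it is best read as a descriptive index of how often the raters coincided.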

Findings
The present study sought to investigate: i. EFL teacher candidates' overall CDS performance (task achievement), ii. the variation of language use (complexity, accuracy, and fluency features of their texts) between the highest- and lowest-performing projects (extreme cases), and iii. the distribution of their self-reported academic learning gains (as displayed in their reflections on foundational knowledge, application, and integration) and personal growth gains (as displayed in their reflections on the human dimension, caring, and learning how to learn). In this section, the findings are presented in line with the two aforementioned research questions.
In response to the first research question, EFL teacher candidates' 14 CDS projects were subjected to rubric-based peer (30%) and teacher (70%) assessment, and the combination of their weighted scores provided the results below on their overall CDS performance. According to Table 1, four of 14 groups (G1-G2-G3-G9) exhibited exemplary performance, as their work embodied derivative originality, a well-structured plot, representative images, and correct use of narrative tenses and cohesive devices. Despite difficulties in creating interest, choosing accurate language forms, and illustrating the basic storyline, six groups (G5-G10-G11-G12-G13-G14) showed good performance, whereas another four (G4-G6-G7-G8) received a passing grade for acceptable performance. When the projects were assessed by the teachers rather than the peers, mean scores rose by 7% from 64.21 to 68.78, and the weighted average came to 67.41. It was also observable from Table 1 that although peer and teacher ratings did not vary greatly, differing by only 1 to 7 points in either direction in 10 instances, STs as novice assessors showed a greater tendency to give lower marks, especially when scoring the top-performing groups. Their joint work faced harsher criticism from peers, and the rating discrepancy ranged from 11 to 21 points in outstanding projects (G1-G2-G3-G9). After determining the highest- and lowest-scoring projects among the 14 CDS projects, the present study focused on comparing these two outlying texts against the linguistic features of complexity, accuracy, and fluency (CAF). The comparison of their CAF measures in Table 2 revealed that G2 produced a longer text (W/T=9.71) with fewer errors (E/T=0.17; EFT/T=0.86) by making slightly more use of clause complexing (C/T=1.59). They made only occasional errors in tense shifts (e.g. "He was mesmerized from the very first page. The book is>was about"), and lexical restrictions (e.g. "H suggested him to tell>suggested telling C"), whereas G7's text was afflicted by persistent problems with the use of more discrete elements such as articles and prepositions (e.g. "[The] shepherd provoked the cows by>with K's sign. [The] provoked cattle started to run"), adverbs of manner (e.g. "K was acting weird>weirdly"), and verb inflections (e.g. "Her brother run>ran to [the] bride's room, and saw a letter [with] his name on it").
It was evident from the presence of reduced adverbial and relative clauses in G7's text (e.g. "Learning [about] his father's death, O, H's betrothed in the cradle, was grieved") that they could produce grammatically complex utterances, but were outperformed by G2 due to lack of attention to details across consecutive sentences (e.g. "By the time L attacked H, K hold>had held him, and they started to fight. L stabbed K from>in his leg. K had hold>held the knife, and took it off>out. K had stand>stood up on his other leg"). Since group work was assumed to enable pooling of resources, and contribute to fluency and accuracy, their performance feedback could grasp the true nature of collaboration between members. Blaming each other for poorer performance, ST4 ("perfectionist") and ST8 ("lazy and reckless") yearned for independent study. ST48 admitted having "a set of communication problems", whereas "non-proportional distribution of roles" was ST5's real culprit in G7.
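The overall figures reported for the first research question can be checked arithmetically. The Python sketch below assumes that 64.21 is the peer mean and 68.78 the teacher mean, and that the two are combined with the stated 30/70 weights; this reading is an inference from the text, and the function names are hypothetical.

```python
PEER_MEAN = 64.21     # assumed to be the mean of peer-assigned totals
TEACHER_MEAN = 68.78  # assumed to be the mean of teacher-assigned totals


def weighted_mean(peer, teacher, peer_weight=0.30):
    """Weighted combination of peer and teacher means (30% / 70%)."""
    return peer_weight * peer + (1 - peer_weight) * teacher


def relative_increase(old, new):
    """Percentage change from old to new."""
    return 100 * (new - old) / old


print(round(weighted_mean(PEER_MEAN, TEACHER_MEAN), 2))      # 67.41
print(round(relative_increase(PEER_MEAN, TEACHER_MEAN), 1))  # 7.1
```

Under these assumptions, both the reported overall average of 67.41 and the rise of about 7% are reproduced.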
In response to the second question, the post-task responses collected from these Turkish EFL teacher candidates were subjected to content analysis. The classification of their summative reflective statements into Fink's (2003) six significant learning outcomes (Table 3) demonstrated that despite a strong inclination to analyse and critique context conditions (44.7%), references to CDS benefits accounted for over half of their reflective writing (55.3%). In comparison with the amount of attention they paid to their conceptual understanding of narrative structure, they adopted a more explanatory approach to the identification of contextual factors influencing task achievement. While they managed to modify the available artwork and/or create their own, the limitedness of web image galleries (11%) ranked higher amongst situational challenges than consensus-reaching (9%), whose potential for generating a wider range of ideas was simultaneously embraced, as in ST22's words: "Different ideas between friends was a problem as well as good supply for creating a story. For example, one wanted to make Hamlet optimistic, and another opposite".

[Table 3, excerpt] Becoming a self-directed learner. ST12: "Picture stories prepared us to the types of work we will be doing when we start our teaching careers. We learned how to make a pictured story for our children." Total: f = 610 (100%).
Thirdly, their critical comments concentrated on social loafing and free-riding behaviours (7%), where one either invests less effort, or totally neglects his duties, and causes another to work under pressure: "I was the only person who tried to make it better. It was hard to do the pictures and translation alone. I had to finish in time, and I was stressed" (ST29). Lack of previous knowledge on basic computer skills was subsequently concluded to disturb group dynamics (6.3%), as those with a flair for technology tended to undertake picturisations: "Some of my team even didn't know about using computer and internet. It makes others do more job. The separation of duty can be unfair and sometimes one of us have more things to do" (ST44). As in the examples from Table 3, 5% referred to organisational issues (group gatherings), and 3.4% mentioned a shortage of computers and high-speed internet. Seeming unconvinced of the quality of their translational choices, 1.6% expressed regret over composing in L1: "We couldn't agree on some translations. Everyone didn't do their job carefully. We had to change them multiple times" (ST43). Peer-editing was another area, where discord grew among group members due to stylistic variations (1.4%). ST4 recounted incidents of losing face with a "rigorous" group member as follows: "I felt on a knife-edge. She wanted everything to be perfect. However, nobody is perfect. I prepared some pictures and sentences, and I did some mistakes. She felt bad, and I felt sad".
Apart from critical reflection, they reported developing important technological, linguistic, and organisational skills for their task performance (5.4%). They appreciated the experiential learning opportunities for improving computer literacy: "I managed to develop my photoshop skills with my friend's help, so I can make high quality pictures now" (ST6). ST13 regarded it as a pleasant diversion from the accustomed mechanical practice of L2 items: "Creating something in a lesson wasn't something students generally do. While we were writing the story, we learned using new words and grammar rules together. Doing this assignment helped me progress my English". Creative (4%) and practical thinking (0.4%) were cited along with their more dominant critical thinking behaviours. The majority of these responses underscored the stimulating effect of group brainstorming, whereas others remarked on the role of images in telling anew "a Shakespeare story": "Making picture stories let us use our imagination. Our creativity got greater and we learned to think unique" (ST3). Their practical thinking merely involved troubleshooting technical issues, though.
Even though they were encouraged to make connections between home and target cultures during their short story study, these STs did not profess themselves acquainted with Fink's (2003) third type of significant learning, integration (0%), which requires even higher-order thinking skills. Their failure to indicate interactions between the course material and learning of different perspectives might be attributed to prioritised concerns about task fulfilment rather than cultural learning per se. In addition to more academically-oriented types of significant learning goals, they addressed the human dimension (f=101, 16.5%), personal and social implications of learning quite frequently. Like ST18 in Table 3, several others stated that they developed interpersonal skills, and positive personal qualities (7.8%) such as confidence, responsibility, tolerance, and respect: ST7: "I was getting along with all my friends except one. I could have killed her. However, I learned respecting her ideas and using them with my ideas. This group work taught me to communicate with people calmly".
ST24: "I felt responsible to my friends like we are in a body. I learned I should do whatever was given to me in a specific time. If I did this work on my own, I'd postpone it over and over".
ST33: "I recognised that I was good at drawing and technological stuff. This gave me courage to achieve more".
A closer look at what they wrote about understanding others (8.7%) demonstrated that like ST19 in Table 3, they were aware of the importance of rapport-building, and joint efforts in achieving shared goals, and positive interdependence was eventually established during team meetings. For ST39, their success "depended on whole individuals' creativity and ability", and they were feeling "forced to [empathise with] each other", whereas ST41's group members maintained harmony by assuming "a humble attitude" respectively.
As to the kind of changes brought about in their attitudes (12%), two positive thought patterns were foregrounded as in Table 3: enjoyment and satisfaction they had of using innovative ways of storytelling. They especially liked combining the traditional written word with self-chosen/modified digital images: "Although designing characters made us tired, we were never bored of doing it because we didn't work just on words but pictures and colours" (ST21). ST38 exemplified how engaging joint authorship could get compared to classical narration: "I had so much fun with my friends. There were so many crazy ideas that we could die laughing. We had a book at the end. It belonged to us. It was a good feeling".
Our STs ultimately focused on how one can become a better student, and regulate one's own learning process (f=92, 15%). They provided three main rationales for favouring collaborative and multimodal approaches to L2 writing (13%): i. ease of task completion (ST49: "It was much easier to work in group because everyone had a responsibility instead of one person having all responsibilities"), ii. improved task outcomes (ST35: "One of us was good at vocabulary. One was good at grammar, and one was good at using technology. That helped us a lot. We had a better result"), and iii. effective expression through digital imagery (ST5: "Using picture story programs was useful. After writing the event, giving the picture is the best descriptive thing"). According to ST20 and ST26, task division and peer-editing empowered them to write and illustrate a full-length narrative in a shorter period of time, and with greater accuracy. ST16 and ST42 also drew attention to the role of effective group composition in group outcomes, and warned that friendship groups may turn these advantages into disadvantages if a group member tries to free-ride on the efforts of others: ST16: "It taught me choosing friends wasn't unimportant thing. They took you up or down. I'd never choose them randomly next time. Despite all, I was happy being a part of group. It gave me power to do that assignment".
ST42: "I should be prepared for all parts of work because if a member doesn't perform his duty, this effects all members. We all get a low mark, not only that person. My group friends are my best friends but I deduced work is one matter and friendship other".
Other participants like ST8 supported the use of technology for enlivening both their texts, and compositional process with illustrations: "Making a project with technology taught me that not every writing homework is boring. We can enjoy it if we use technology. We can describe more efficiently what we want to say or directly show it". Like ST12 from Table 3, only a few (2%) came to the realisation that they were introduced to new technological tools, and collaborative writing techniques to revamp what might otherwise be replaced with controlled practice activities, or abandoned altogether in the traditional Turkish EFL classroom. ST37 stated that "doing conventional and same things with paper and pencil" no longer felt "enjoyable", and was consequently inspired to "share this interesting experience with [his] students" in the future.

Discussion
In the light of the findings from the outcome (i.e. peer and teacher assessment of CDS projects and CAF analysis of outlying texts) and process (i.e. content analysis of post-task responses according to the significant learning taxonomy) evaluations, the great majority of these Turkish EFL teacher candidates can be said to have successfully fulfilled their CDS task, undertaken the future reviewer role effectively, and responded positively to co-construction and technology integration. Despite slight variations in the frequency of higher peer ratings and in discrepancy ranges, the student assessors here showed the same tendency as Soler-Pardo's (2014) to give lower scores than their teacher counterparts, while the top-performing groups received harsher assessments, as in Steverding et al.'s (2016) study of peer-marking in oral presentations. The minor difference between peer and teacher mean scores suggested that they understood the evaluative criteria well, and managed to evaluate academic products appropriately (O'Neill & Morcke, 2016). This might be because: i. the use of unequal weighting (70-30%), which let them contribute substantially to their final grades, made peer assessment more meaningful, and ii. the simulated teaching experience of participating in rubric-based assessment of each other's work could have stimulated them to proper execution (Onyia, 2014; Saito & Fujita, 2004; Soler-Pardo, 2014).
When results from CAF and reflection analyses were evaluated together, the biggest difference between the highest- and lowest-scoring projects was found in grammatical accuracy rather than syntactic length and complexity. In spite of having similar language proficiencies, the less cooperative group's failure to notice grammatical errors can be explained by weaker group awareness. They apparently liked to work on different parts (sequentially) at different times (asynchronously) in different locations (distributedly), avoided face-to-face conversations outside the classroom, and diminished their chances of consensus and control over their CDS projects; therefore, as Lowry et al. (2004) pointed out, poor communication and weak relationships might have led to their poorer performance. Several previous studies on CDS activities had also documented positive overall results, attempts at complex structures, relatively fewer grammar errors, and skilful use of technology, mainly through teacher-led global evaluations (Alcantud-Diaz, 2016; Ramirez, 2013; Reyes-Torres et al., 2012; Sevilla-Pavon et al., 2012; Soler-Pardo, 2014), but Kim's (2018) CAF analysis alone showed that despite no significant development in fluency and complexity, multiple practices of peer-editing increased L2 writing accuracy, when partners worked well together towards the same end.
Apart from professional (assessor) role and L2 writing development, the present study aimed to engage the participants in some form of heuristic evaluation. Just as UX (user experience) designers test a website's user-friendliness before end-users, so prospective English teachers can get hands-on experience of collaborative and technology-integrated education, and having seen for themselves, can be better positioned to weigh all factors involved in making the decision to implement the same progressive teaching techniques. On the academic learning front, this CDS experience seemed to have elicited more procedural, critical, creative, and practical thinking than disciplinary and integrative thinking from the participants. Like Rubino et al.'s (2018) 43 Italian secondary school students, who co-wrote the prequel and sequel of a novel by using a digital storytelling platform, this study group demonstrated a conceptual understanding of the mechanisms of narrative writing.
As for the development of more advanced integration learning goals such as intercultural awareness, highlighted cross-cultural similarities and differences remained unaddressed in their responses. Given the comparative focus of the assigned task, this finding stood in direct contrast to that of Alcantud-Diaz (2016), where 48 pre-service EFL teachers displayed social awareness through their CDS projects on the Syrian refugee crisis. Hamilton et al. (2019), who examined the study abroad reflections of American undergraduates as manifested in 25 of their digital stories, nonetheless similarly discovered that although they were mostly able to articulate their content knowledge about progress and sustainability issues, and organise different forms of media into a cohesive presentation, they failed to clearly show curiosity and understanding of other worldviews. To achieve the desired transition from rote to deeper learning in this study, the learners' conscious attention could therefore have been more systematically focused on the links between their theoretical and experiential knowledge.
The dominance of application comments over those of foundational knowledge and integration also pointed to greater preoccupation with task performance. They appeared to care more about putting into action newly-acquired generic and technological knowledge, exploring task management skills, and critically evaluating task decisions and conditions during teamwork. Congruent with their predecessors' self-reports, the participants corroborated the following benefits of digital storytelling in collaborative groups for: i. improved language skills (Kent, 2011; Kim, 2018; Ming et al., 2014; Oskoz & Elola, 2014; Ramirez, 2013 [...] As revealed by past research and current reflective analysis, the essence of the CDS experience lies in one's capacity to amplify the learning resources and modalities. With the incorporation of collaborative work and digital technologies, the learners' choices in the acquisition and exposition of L2 competence were no longer limited to the good old coursebook, mostly monologic teacher-talk, dull worksheets, or the traditional essay. Our STs mainly favoured learning with and from more proficient, tech-savvy peers, and the internet, overcoming language and other challenges with the help of group intelligence, bridging the gap between their lexicogrammatical repertoire and authentic contexts of use, and facilitating reading comprehension through verbal and visual support. Since the use of technology, visual and/or other extralinguistic cues allowed compensation for language limitations and flexibility in the work mode, LEP students also expressed their liking for digital storytelling circles (Bandi-Rao & Sepp, 2014; Castaneda, 2013; Smeda et al., 2014). As the old cliché goes, variety is indeed the spice of the learners' lives, and in the end, it all comes down to liberating them from the four-walled classroom, and inhibited self.
Although documented constraints facing similar applications primarily related to inadequate provision of technological literacy, and infrastructure, time-consuming nature of the digital storytelling process, student resistance, and misplaced attention to visuals, this study expanded on previous research by identifying and ranking the factors Turkish EFL teacher candidates as both L2 learners and observant apprentices perceived to influence their decision-making and task achievement in the particular practice setting (Alcantud-Diaz, 2016; Bandi-Rao & Sepp, 2014; Genereux & Thompson, 2008; Kent, 2011; Sadik, 2008; Sepp & Bandi-Rao, 2015; Vinogradova et al., 2011). The almost equal distribution of their critical reflections between technology- and group-related constraints suggested that besides being aware of the importance of a prior computer knowledge base, and computing facilities, they also became aware that positive group dynamics can ameliorate the negative consequences of resource scarcity, whereas their absence can hinder efforts to achieve the best possible outcome. While the most immediate difficulty of finding appropriate images was already touched upon in three other studies, these STs overcame website limitations by creating custom images (Chiang, 2020; Genereux & Thompson, 2008; Ramirez, 2013). Having group members without basic computer skills was again considered problematic, although expertise and assistance were sought from more adept partners in the face of technological ignorance and deficiencies (Bandi-Rao & Sepp, 2014; Ramirez, 2013; Sadik, 2008; Sepp & Bandi-Rao, 2015; Vinogradova et al., 2011).
As demonstrated by the higher frequency of critical comments on issues of consensus and task division, the respondents struggled to negotiate over a likeable scenario, since they were used to composing in solitude, and appealing to the teacher's taste alone. Even though the resultant diversity of ideas was welcomed in their collaborative groups, there were some that could not sustain their early enthusiasm beyond brainstorming. As documented by three consecutive studies, where similar undergraduates from the American, Greek, and Japanese contexts were criticised for not contributing their fair share, the social loafing phenomenon was not uncommon among these STs (Arnold et al., 2012; Karasavvidis, 2010; Mulligan & Garofalo, 2011). In contrast to the general assumption that self-selected groups are better at managing interpersonal conflicts, and can boost performance, letting them choose their teammates did not resolve the free-rider problem here but rather proved able to complicate even the simplest task of scheduling group meetings outside the classroom (Davis, 2009). In Elola and Oskoz's (2010) study, eight Spanish majors preferred individual to collaborative writing for the same reason that they disliked disagreeing with co-authors, being dependent on other people's input, and having to negotiate meeting times to get the job done.
Besides fluctuating membership, and logistical challenges, a small minority experienced language-related difficulties in selecting appropriate L2 equivalents, and making revisions to each other's contributions. Contrary to Sevilla-Pavon et al.'s (2012), and Lawrence and Wah's (2016) studies, where weaker learners resorted to online translation tools, or got more proficient partners to grammaticise their L1 meanings, the collective appeal of our independent users to translated rather than direct writing can be related to the natural urge to access and convey richer content through thinking ideas in L1, and reformulating them into L2 together. Fung (2010), too, identified L1 use as a facilitating factor for developing ideas, checking word meanings, and overcoming working-memory limitations. Yet, paradoxically enough, expertise sharing, which was supposed to ease text production, and alleviate L2 writing anxiety, left them worrying about mutual language choices. Just as other adult L2 learners in various studies expressed reservations about peer-editing because they wanted neither to hurt their friends' feelings nor lose face, ours also acknowledged disagreements and discomfort among learning partners due to lack of confidence in their own and others' language skills (Elola & Oskoz, 2010; Karasavvidis, 2010; Mulligan & Garofalo, 2011; Storch, 2005). In addition to social loafing, reduced sensitivity to grammatical inaccuracies might be linked to group formation out of close friends. It was thus considered that assigning fewer members, assessing partners, and reshuffling groups on the basis of peer evaluations could have increased commitment, and harmony in such affinity-based self-selected groups (Arnold et al., 2012; Karasavvidis, 2010; Mulligan & Garofalo, 2011).
In embracing and learning from their challenges, these STs, all soloists formerly competing in individualist, teacher-led classrooms, reported developing certain positive character traits, including confidence, responsibility, tolerance, and respect (Bandi-Rao & Sepp, 2014; Castaneda & Rojas-Miesse, 2016; Chiang, 2020; Kent, 2011; Mulligan & Garofalo, 2011; Ramirez, 2013; Sepp & Bandi-Rao, 2015), along with interpersonal and social skills for maintaining rapport, empathy, and trust in their teams (Ming et al., 2014; Mulligan & Garofalo, 2011; Reyes-Torres et al., 2012; Sepp & Bandi-Rao, 2015; Sevilla-Pavon, 2015; Soler-Pardo, 2014; Vinogradova et al., 2011), and consequently, the current study extended previous findings on the personal growth gains from the combination of collaborative writing and digital storytelling. In line with previous studies, these STs realised that through mutual interaction, they could assist one another with language and technical issues, and achieve more by combining strengths than working individually (Fung, 2010; Sevilla-Pavon, 2015; Smeda et al., 2014; Storch, 2005; Vinogradova et al., 2011). They were additionally aware that apart from language choices and task procedures, they needed to negotiate relations between partners, and just as cognitive conflicts were inevitable during consensus-building, and necessary for broader perspectives and better outcomes, unresolved conflicts (personality clashes) could destroy team cohesion, and performance (Fung, 2010).
The CDS experience also inspired positive emotions like satisfaction and enjoyment, and their self-reports highlighted the catalytic roles of collective authorship, and digitised picturisations in creating a conducive atmosphere for further engagement. These results supported prior research regarding teachers' and students' positive attitudes towards similar applications across a wide range of teaching-learning situations (Chiang, 2020; Kesli-Dollar & Tekiner-Tolu, 2015; Ming et al., 2014; Ramirez, 2013; Sadik, 2008; Sevilla-Pavon et al., 2012; Smeda et al., 2014; Soler-Pardo, 2014; Zakaria & Aziz, 2019). Apart from the stimulating blend of linguistic and extralinguistic resources, the participants of both this and earlier research explained their favourable reception by relating to the sense of accomplishment gained from completing a challenging task, and genuine feelings of pride and ownership evoked by their joint products (Castaneda, 2013; Fries-Gaither, 2010; Kent, 2011; Kesli-Dollar & Tekiner-Tolu, 2015; Sadik, 2008; Sepp & Bandi-Rao, 2015). As in a few previous instances, where amusement, besides engagement, was elicited from students, these STs found their CDS activities interesting and fun in comparison to traditional writing assignments, and reminisced especially about how they laughed over brainstormed alternatives during screenwriting (Boase, 2008; Choo & Li, 2017; Fung, 2010; Genereux & Thompson, 2008; Reyes-Torres et al., 2012).
The evaluation of their critical comments about learning how to learn finally revealed that the current participants, though not as a whole, knew from experience how shared workload and expertise can accelerate task performance, and upgrade the quality of co-products, and why multiple modalities matter in meaning exchange, but at the same time, accepted the group structure as an enabling condition for such collective scaffolding. Even in most online learning environments, collaborative group work and classroom interaction could not be left out for the sake of student engagement and learning (Gonzalez & Moore, 2020). Our findings thus confirmed previous student reports of time- and effort-saving benefits, improvements in content, organisation, and grammaticality, and enhanced communication via multimodal meaning-making tools (Elola & Oskoz, 2010; Hanington et al., 2013; Kesli-Dollar & Tekiner-Tolu, 2015; Kim, 2018; Mulligan & Garofalo, 2011; Oskoz & Elola, 2014; Sepp & Bandi-Rao, 2015; Sevilla-Pavon et al., 2012). They were also cognisant of the fact that as opposed to just plain sense reading of the model text, mimicking its narrative structure, and submitting individual stories to the teacher for scoring as is often the case with most Turkish EFL learners, the writing partners were here given free rein to recontextualise the original story in a contemporary/local setting, and experiment with more exciting, creative, and innovative mediums of storytelling. Numerous studies likewise accentuated the allure of technology, extolled its potential to change monotonous and formal "schoolwork" into powerful aesthetic experience, and asserted that the value the practitioners accorded to digital storytelling was also partly due to the "novelty effect" in their visualisation of what they read and write (Boase, 2008; Chiang, 2020; Gunter, 2012; Kent, 2011; Kesli-Dollar & Tekiner-Tolu, 2015, p. 180; Kim, 2018, p. 83; Soler-Pardo, 2014).
Their post-task responses reflected a keen awareness of the benefits and challenges of co-construction and technology use in their L2 narration, as well as ultimate attainment of self-direction encouraged through increased efficiency and enthusiasm (Bandi-Rao & Sepp, 2014; Choo & Li, 2017; Kent, 2011; Ming et al., 2014; Ramirez, 2013); however, not many of these STs could further recognise the importance of acquiring an insider's perspective on the use of such participatory learning activities. Maddin's (2012), Hanington et al.'s (2013), Soler-Pardo's (2014), and Alcantud-Diaz's (2016) prospective L2 teachers, too, indicated developing firsthand information about the educational value, and implementation of collaborative writing and digital storytelling processes. But having been similarly discontented with their display of pedagogical awareness, Hanington et al. (2013) proposed increasing the explicitness of links between task-based activities, and promoted methodologies to ensure their learning transfer. As also recommended by Genereux and Thompson (2008), the current participants might have been guided into further thinking on potential areas for future application of this knowledge through additional reflection questions, and therefore, could have been better assisted in transcending immediate learning goals, and devising their personal professional development plans.

Conclusion and Suggestions
According to the results of rubric-based peer and teacher assessment of their CDS projects, comparative CAF analysis of the two outlying texts, and Turkish EFL teacher candidates' reflective self-reporting on the overall CDS process, it was revealed that: i. despite marking top-performing groups more harshly, they successfully completed the CDS task, performed their future reviewer role effectively, and responded positively to co-construction, and technology integration; ii. since the greatest difference between the highest- and lowest-scoring texts lay in grammatical accuracy, diminished interaction between partners could account for the less cooperative group's poorer performance; iii. the CDS experience elicited more procedural, critical, creative, and practical thinking than disciplinary and integrative thinking due to more pervasive preoccupation with task achievement; iv. their critical thinking concentrated on consensus-reaching, fluctuating membership, and logistical challenges, and also indicated a clear understanding of the role of positive group dynamics in group outcomes; v. despite heightened awareness of the performance-boosting, character-forming, and motivational benefits of collective scaffolding and multimodal meaning-making, only a minority could discern the instrumentality of progressive teaching techniques in their future classroom practices.
On the basis of the available evidence, the use of collaborative digital stories can be concluded to have been validated among these pre-service EFL teachers as an emancipatory educational tool for surpassing the achievements of the lone L2 user/teacher, alternating between the traditional and 21st-century modes of effective language and lesson delivery, and fostering their holistic development as actual experiencers/future agents of innovative classroom practices. It remains to be seen whether and to what extent: i. longer durations of tool-use training, ii. a dual use of ongoing and summative reflection, iii. more deliberate explicit focus on the significant learning goals, or iv. adopting different tools, grouping structures (e.g. fixed/mixed-ability), and control/work modes can reduce student-perceived concerns about task readiness, technical support, and division of labour, and induce improved teaching/learning performance in the L2 classroom.
As Benmayor (2008, p. 188) appropriately summed up in three words, this "assets-based social pedagogy" is here to stay. Due to the superior effect of the long-standing personal observations, and experiences from their student learning, teachers may nevertheless prefer teaching as they were once taught to enacting varied methodologies their faculties have trained them for (Bailey et al., 1996; Peacock, 2001). For this reason, it becomes even more important to get pre-service teachers to live the real thing, reflect on their actions, and recreate their lived experiences, if we expect to increase their chances of integrating new instructional techniques and resources into their teaching (Guikema & Menke, 2014; Kajder, 2005).
There is also an urgent need for curricular revisions to increase contact and familiarity in signature pedagogies. New findings on perceptions of digital literacy highlight that despite strong belief in its vitality for their teacher development, many prospective teachers still continue to associate digital literacy with complexity (Dedebali, 2020). In today's world, where the focus has recently shifted in higher education from digital literacy to digital fluency (a much more sophisticated competence to use digital tools for content co-creation, creative design, and adaptive problem-solving), the teacher is no longer conceived as knowledge-transmitter but rather facilitator and curator, and the overwhelming necessity of redesigning instruction accordingly (i.e. creating "digitally rich learning environments and pedagogically sound learning experiences") is emerging as a wicked challenge for all subject-matter instructors, deprived of equal access to technology and instructional design expertise (Alexander et al., 2019, p. 15).
Perhaps more than in any other discipline, pre-service L2 teachers need to learn (by concrete experience, active experimentation, and conscious reflection) how to repackage the language-/subject-specific content they have mastered in multiple formats, so that they can get their messages across clearly to the target audience, whether composed of fellow student teachers, teacher educators, cooperating teachers, and pupils at practicum schools, or an even wider circle of global English users. To this end, we finally suggest that instead of restricting technology use to one-time add-on courses to pre-service teacher education programs, all stakeholders should take a technology-across-the-curriculum approach, and maximise the utility of such technological resources as collaborative digital stories in developing EFL teacher candidates' language and pedagogic skills, and also seasoning their lectures with similar multigenre projects.

Limitations
Instead of making generalisations from the results, this study set out to develop deeper insights into the impacts of collaborative digital storytelling process on Turkish EFL teacher candidates' L2 writing performance, academic learning, and also personal growth; however, the relatively small size of the study group, their proficiency level (upper-intermediates), and focus on extreme (i.e. highest- and lowest-scoring) cases during text analysis might be listed as the limitations of the study. Apart from outcome assessments and self-reported knowledge, future researchers can also employ experimental mixed-methods designs in order to verify the effectiveness of similar applications on different learner profiles and proficiency groups.