
Corpora developed by linguists to study languages are a promising source of authentic materials to employ in the development of OER for language learning. Recently, COERLL’s SpinTX Corpus-to-Classroom project launched a new open resource that seeks to make it easy to search and adapt materials from a video corpus.
The SpinTX video archive provides a pedagogically friendly web interface to search hundreds of videos from the Spanish in Texas Corpus. Each of the videos is accompanied by synchronized closed captions and a transcript that has been annotated with thematic, grammatical, functional and metalinguistic information. Educators using the site can also tag videos for features that match their interests, and share favorite videos in playlists.
A collaboration among educators, professional linguists, and technologists, the SpinTX project leverages different aspects of the “openness” movement including open research, open data, open source software, and open education. It is our hope that by opening up this corpus, and by sharing the strategies and tools we used to develop it, others may be able to replicate and build on our work in other contexts.
So, how do we make a corpus open and beneficial across communities? Here are 5 ways:
Minimize barriers to your content. Searching the SpinTX video archive requires no registration, passwords or fees. To maximize accessibility, think about your audience’s context and needs. The SpinTX video archive offers a corpus interface specifically for educators, and plans to create a different interface for researchers.
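Add a Creative Commons license to your corpus materials. The SpinTX video archive uses a CC BY-NC-SA license, which allows others to reuse and adapt the materials in different contexts, provided they give attribution, use the materials non-commercially, and share adaptations under the same license.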
Allow others to easily embed or download your content and data. The SpinTX video archive provides social sharing buttons for each video and gives access to the source data (tagged transcripts) through Google Fusion Tables.
When possible, use and build upon open source tools. The SpinTX project was developed using a combination of open source software (e.g. TreeTagger, Drupal) and open APIs (e.g. YouTube Captioning API). Custom code developed for the project is openly shared through a GitHub repository.
Make it easy for others to replicate and build on your work. The SpinTX team is publishing its research protocols, development processes and methodologies, and other project documentation on the SpinTX Corpus-to-Classroom blog.
Openly sharing language corpora may have wide-ranging benefits for researchers, educators, language learners, and the general public. The SpinTX team is interested in starting a conversation across these communities. Have you ever used a corpus before? What did you use it for? If you have never used a corpus, how do you find and use authentic videos in the classroom? How can we make video corpora more accessible and useful for teachers and learners?
—
Rachael Gilg is the Project Manager and Lead Developer for COERLL’s Spanish in Texas Corpus project and the SpinTX Corpus-to-Classroom project. She has acted as project manager, designer, and developer on a diverse set of projects, including educational websites and online courses, video and interactive media, digital archives, and social/community websites.
Carl Blyth says
Great post, Rachael. And what a fantastic OER! Kudos to the SpinTX team. I can’t wait for Spanish teachers to discover this amazing resource.
I’d like to mention that COERLL will be presenting SpinTX at several conferences this summer (CALICO in Honolulu, AATSP in San Antonio) as well as a free webinar on June 26th. Register for the June webinar by going here:
http://coerll.utexas.edu/coerll/event/june-webinar-series-focus-spintx
Btw, I want to draw readers’ attention to two Canadian video archives for learning French:
1. “Francotoile” from the University of Victoria (BC)
http://francotoile.uvic.ca/
2. “Vidéotech” (Carleton University, Ontario)
https://video-tech.ca/
Francotoile presents French as an international language with clips of native speakers from 5 continents. You can easily download all videos to your hard drive!
Vidéotech contains NS videos and NNS videos. It also allows you to create activities based on easy-to-use templates.
Both resources feature many of the elements that Rachael cites for “opening up foreign language learning,” including Creative Commons licenses.
Andrew Weiler says
Seems like a worthy endeavour. I would like to add that it is important to be mindful of the learning process and let that drive any developments. Too often developments in this field lose track of the fundamentals of what a language learner needs in order to learn. One practical suggestion is to consider game theory (what drives the phenomenal addictive power of video games) as a way to integrate what we know about successful use of this medium into learning.
Rachael says
Excellent point, Andrew! Since educators and language learners are our primary stakeholders, their needs should drive the design process. We started our project by interviewing educators and developing a needs assessment that informed our design. In addition, we are trying to follow a lean startup approach, which means getting a simple version of the tool launched early and adding new features incrementally based on user observation and feedback.