Open Up

Conversations on Open Education for Language Learning

5 Ways to Open Up Corpora for Language Learning

By Rachael Gilg

May 15, 2013

Corpora developed by linguists to study languages are a promising source of authentic materials for developing open educational resources (OER) for language learning. Recently, COERLL’s SpinTX Corpus-to-Classroom project launched a new open resource that seeks to make it easy to search and adapt materials from a video corpus.

The SpinTX video archive provides a pedagogically friendly web interface for searching hundreds of videos from the Spanish in Texas Corpus. Each video is accompanied by synchronized closed captions and a transcript that has been annotated with thematic, grammatical, functional, and metalinguistic information. Educators using the site can also tag videos for features that match their interests and share favorite videos in playlists.

A collaboration among educators, professional linguists, and technologists, the SpinTX project leverages different aspects of the “openness” movement including open research, open data, open source software, and open education. It is our hope that by opening up this corpus, and by sharing the strategies and tools we used to develop it, others may be able to replicate and build on our work in other contexts.

So, how do we make a corpus open and beneficial across communities? Here are 5 ways:

1. Create an open and accessible search interface

Minimize barriers to your content. Searching the SpinTX video archive requires no registration, passwords, or fees. To maximize accessibility, think about your audience’s context and needs. The SpinTX video archive offers a corpus interface designed specifically for educators, and the team plans to create a different interface for researchers.

2. Use open content licenses

Add a Creative Commons license to your corpus materials. The SpinTX video archive uses a CC BY-NC-SA license, which requires attribution, noncommercial use, and sharing under the same license, but allows others to reuse and adapt the materials in different contexts.

3. Make your data open and share content

Allow others to easily embed or download your content and data. The SpinTX video archive provides social sharing buttons for each video, as well as access to the source data (tagged transcripts) through Google Fusion Tables.
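
As a rough illustration of what openly shared source data makes possible, the sketch below filters a downloaded table of tagged transcripts by tag. The column names (`video_id`, `transcript`, `tags`), the semicolon-separated tag format, and the sample rows are all hypothetical and are not the actual SpinTX schema; the point is only that tabular data released in an open format can be reused with a few lines of Python.

```python
import csv
import io

# Hypothetical sample standing in for a CSV export of tagged transcripts.
# Column names and rows are illustrative assumptions, not the real SpinTX data.
SAMPLE_CSV = """video_id,transcript,tags
v001,Cuando era niña vivíamos en el valle.,imperfect;family
v002,Espero que vengas a la fiesta.,subjunctive;invitations
v003,Mi abuela cocinaba todos los domingos.,imperfect;food
"""


def find_by_tag(csv_text, tag):
    """Return the rows whose semicolon-separated 'tags' field contains tag."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if tag in row["tags"].split(";")]


# Example: list every clip tagged as illustrating the imperfect tense.
matches = find_by_tag(SAMPLE_CSV, "imperfect")
for row in matches:
    print(row["video_id"], row["transcript"])
```

A teacher or developer could point the same function at a real export by reading the file's text instead of the sample string, adjusting the column names to whatever the published data actually uses.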

4. Embrace open source development

When possible, use and build upon open source tools. The SpinTX project was developed using a combination of open source software (e.g. TreeTagger, Drupal) and open APIs (e.g. YouTube Captioning API). Custom code developed for the project is openly shared through a GitHub repository.

5. Make project documentation open

Make it easy for others to replicate and build on your work. The SpinTX team is publishing its research protocols, development processes and methodologies, and other project documentation on the SpinTX Corpus-to-Classroom blog.

Openly sharing language corpora may have wide-ranging benefits for diverse communities of researchers, educators, and language learners, as well as for the general public. The SpinTX team is interested in starting a conversation across these communities. Have you ever used a corpus before? What did you use it for? If you have never used a corpus, how do you find and use authentic videos in the classroom? How can we make video corpora more accessible and useful for teachers and learners?

—

Rachael Gilg is the Project Manager and Lead Developer for COERLL’s Spanish in Texas Corpus project and the SpinTX Corpus-to-Classroom project. She has acted as project manager, designer, and developer on a diverse set of projects, including educational websites and online courses, video and interactive media, digital archives, and social/community websites.

Filed Under: Instructional Materials, Methods/Open educational practices (OEP), Open education philosophy, Publishing OER, Spanish

Tagged With: COERLL, OER, open data, Open education, Open research, open source software, Spanish, Spanish in Texas, Spanish language learning, Spanish video

Comments

  1. Carl Blyth says

    May 15, 2013 at 9:21 pm

    Great post, Rachael. And what a fantastic OER! Kudos to the SpinTX team. I can’t wait for Spanish teachers to discover this amazing resource.

    I’d like to mention that COERLL will be presenting SpinTX at several conferences this summer (CALICO in Honolulu, AATSP in San Antonio) as well as a free webinar on June 26th. Register for the June webinar by going here:
    http://coerll.utexas.edu/coerll/event/june-webinar-series-focus-spintx

    Btw, I want to draw readers’ attention to two Canadian video archives for learning French:
    1. “Francotoile” from the University of Victoria (BC)
    http://francotoile.uvic.ca/

    2. “Vidéotech” (Carleton University, Ontario)
    https://video-tech.ca/

    Francotoile presents French as an international language with clips of native speakers from 5 continents. You can easily download all videos to your hard drive!
    Vidéotech contains NS videos and NNS videos. It also allows you to create activities based on easy-to-use templates.

    Both resources feature many of the elements that Rachael cites for “opening up foreign language learning,” including Creative Commons licenses.

  2. Andrew Weiler says

    May 19, 2013 at 10:00 am

    Seems like a worthy endeavour. I would like to add that it is important to be mindful of the learning process and let that drive any developments. Too often developments in this field lose track of the fundamentals of what is required for a language learner to learn. One practical suggestion is to consider game theory (what drives the phenomenal addictive power of video games) as a way to integrate what we know about successful use of this medium into learning.

    • Rachael says

      May 20, 2013 at 1:53 pm

      Excellent point Andrew! Since educators and language learners are our primary stakeholders, their needs should drive the design process. We started our project by interviewing educators and developing a needs assessment that informed our design. In addition, we are trying to follow a lean startup approach, which means getting a simple version of the tool launched early and adding new features incrementally based on user observation and feedback.


Creative Commons License · COERLL · University of Texas at Austin