Open Up

Conversations on Open Education for Language Learning

5 Ways to Open Up Corpora for Language Learning

By Rachael Gilg

May 15, 2013

Corpora developed by linguists to study languages are a promising source of authentic materials to employ in the development of OER for language learning. Recently, COERLL’s SpinTX Corpus-to-Classroom project launched a new open resource that seeks to make it easy to search and adapt materials from a video corpus.

The SpinTX video archive provides a pedagogically friendly web interface for searching hundreds of videos from the Spanish in Texas Corpus. Each video is accompanied by synchronized closed captions and a transcript that has been annotated with thematic, grammatical, functional and metalinguistic information. Educators using the site can also tag videos for features that match their interests, and share favorite videos in playlists.

A collaboration among educators, professional linguists, and technologists, the SpinTX project leverages different aspects of the “openness” movement including open research, open data, open source software, and open education. It is our hope that by opening up this corpus, and by sharing the strategies and tools we used to develop it, others may be able to replicate and build on our work in other contexts.

So, how do we make a corpus open and beneficial across communities? Here are 5 ways:

1. Create an open and accessible search interface

Minimize barriers to your content. Searching the SpinTX video archive requires no registration, passwords or fees. To maximize accessibility, think about your audience’s context and needs. The SpinTX video archive offers a corpus interface designed specifically for educators, and the team plans to create a different interface for researchers.

2. Use open content licenses

Add a Creative Commons license to your corpus materials. The SpinTX video archive uses a CC BY-NC-SA license, which requires attribution and allows others to reuse and adapt the materials in different contexts for non-commercial purposes, provided that adaptations are shared under the same license.

3. Make your data open and share content

Allow others to easily embed or download your content and data. The SpinTX video archive provides social sharing buttons for each video, and gives access to the source data (tagged transcripts) through Google Fusion Tables.

4. Embrace open source development

When possible, use and build upon open source tools. The SpinTX project was developed using a combination of open source software (e.g. TreeTagger, Drupal) and open APIs (e.g. YouTube Captioning API). Custom code developed for the project is openly shared through a GitHub repository.
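
As a rough illustration of what building on such tools can look like, here is a minimal Python sketch that reads part-of-speech tagged text in the tab-separated layout TreeTagger typically prints (token, tag, lemma) and counts verb lemmas. It is not the SpinTX code: the sample sentence is invented, the function names are hypothetical, and the tags are assumed to follow TreeTagger's Spanish tagset.

```python
from collections import Counter

# Invented sample for illustration only (not SpinTX data).
# Assumed layout: one token per line, tab-separated as token, tag, lemma.
SAMPLE = (
    "Mi\tPPO\tmi\n"
    "familia\tNC\tfamilia\n"
    "habla\tVLfin\thablar\n"
    "español\tNC\tespañol\n"
    ".\tFS\t.\n"
)


def parse_tagged(text):
    """Return (token, tag, lemma) tuples, skipping lines that do not have three fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def lemmas_with_tag_prefix(rows, prefix):
    """Count lemmas whose tag starts with the given prefix (for example "V" for verbs)."""
    return Counter(lemma for _, tag, lemma in rows if tag.startswith(prefix))


rows = parse_tagged(SAMPLE)
verb_lemmas = lemmas_with_tag_prefix(rows, "V")
print(verb_lemmas.most_common())
```

A tally like this could feed a search filter such as "videos that use a given verb", though how the SpinTX archive builds its own grammatical annotations is not described in this post.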

5. Make project documentation open

Make it easy for others to replicate and build on your work. The SpinTX team is publishing its research protocols, development processes and methodologies, and other project documentation on the SpinTX Corpus-to-Classroom blog.

Openly sharing language corpora may have wide-ranging benefits for diverse communities of researchers, educators, and language learners, and may also serve the public interest. The SpinTX team is interested in starting a conversation across these communities. Have you ever used a corpus before? What did you use it for? If you have never used a corpus, how do you find and use authentic videos in the classroom? How can we make video corpora more accessible and useful for teachers and learners?

—

Rachael Gilg is the Project Manager and Lead Developer for COERLL’s Spanish in Texas Corpus project and the SpinTX Corpus-to-Classroom project. She has acted as project manager, designer, and developer on a diverse set of projects, including educational websites and online courses, video and interactive media, digital archives, and social/community websites.

Filed Under: Instructional Materials, Methods/Open educational practices (OEP), Open education philosophy, Publishing OER, Spanish

Tagged With: COERLL, OER, open data, Open education, Open research, open source software, Spanish, Spanish in Texas, Spanish language learning, Spanish video

Creative Commons License · COERLL · University of Texas at Austin