PELIC

The University of Pittsburgh English Language Institute Corpus



PELIC Dataset repository on GitHub

 

Contact: Ben Naismith (bnaismith@pitt.edu)

The University of Pittsburgh English Language Institute Corpus (PELIC)

Welcome to the homepage for the University of Pittsburgh English Language Institute Corpus (PELIC). Here you will find information about the people involved with PELIC and the research using these data.

If you are interested in learning more about PELIC or downloading the dataset, please visit the PELIC-dataset GitHub repository, which contains the publicly available data, a detailed description of the corpus, frequency statistics, and a set of lexical tools and tutorials.


Table of contents

  1. PELIC overview
  2. People
  3. Acknowledgments
  4. Selected publications and presentations based on PELIC data
  5. Recent and current projects
  6. Other research based on ELI students
  7. PELIC used elsewhere


1. PELIC overview

The University of Pittsburgh English Language Institute Corpus (PELIC) is a large learner corpus of written and spoken texts. These texts were collected in an English for Academic Purposes (EAP) context over seven years in the University of Pittsburgh’s Intensive English Program, and were produced by students with a wide range of linguistic backgrounds and proficiency levels. PELIC is longitudinal, offering greater opportunities for tracking development in a natural classroom setting.

This webpage provides information about the people involved with and the research resulting from PELIC. Where possible, PDFs of the research are provided.


2. People

ELI students and teachers

First and foremost, we wish to thank and acknowledge the students of the ELI for their spoken and written contributions to PELIC, and for graciously allowing us to publicly share these data. We hope that the research stemming from their work will improve the quality of learning and teaching for other students in similar contexts who are striving to attain academic readiness.

We also wish to thank the teachers and administrators of the ELI for their assistance in diligently collecting the data from the students.


ELI Data Mining Group

The ELI Data Mining Group is a research group in the Department of Linguistics at the University of Pittsburgh. Their focus is on applying computational methods to the PELIC dataset.

Current ELI Data Mining Group:

ELI Data Mining Group associate members:


3. Acknowledgments

We would like to thank the National Science Foundation for its funding through the Pittsburgh Science of Learning Center under award number SBE-0836012 (previously NSF award number SBE-0354420).

There have been numerous people involved throughout the process of compiling and creating PELIC, assisting with data collection, coding, etc. In particular, we wish to acknowledge the following programmers for their important roles: Ben Madore, Shanwen Yu, and Michael Nugent.


4. Selected publications and presentations based on PELIC data


5. Recent and current projects

We are in the initial stages of preparing the spoken PELIC data for public release.


6. Other research based on ELI students

In addition to publications based on PELIC data, a number of studies and datasets have been published based on other data from this same population, i.e., students at the ELI:

Papers based on the Reading Tutor REAP in the English Language Institute

Spoken data posted online by Vercellotti (2016) and available for download and analysis in CLAN

Published and unpublished MA theses and PhD dissertations written with support from LearnLab.org and the English Language Institute


7. PELIC used elsewhere

PELIC data can also be found in the following locations:

