LDC Corpora

We hold Linguistic Data Consortium (LDC) membership for the following years: 1993-1996, 1999-2001, 2006, 2009-2010. LDC corpora are available to members of Rutgers University for non-commercial education, research, and technology development purposes only.

The table below shows the corpora that we have available. To download them, please visit our LDC Corpora Download Information Form page. When you click the link, you will be prompted to log in. Please log in with your NetID to verify your Rutgers University affiliation; you will then receive download instructions.

You can sort the table below by clicking a column header. Clicking the same header again will toggle the sort order between ascending and descending. Please click the Catalog ID link to view more information about a corpus on the LDC website.

Please contact us at salts-admin [at] rutgers.edu if you have any questions, comments, or feedback.