el corpus del español

There are several resources that are based on the older version of the Corpus del Español (which was released in 2001), such as:

The older Corpus del Español was quite small, however (only 20 million words for the 1900s). As a result, many types of resources that we've created for English couldn't be created for Spanish until a much larger corpus was available. With the new two-billion-word corpus, we can create many of these resources. They will include:

  • Full-text data, which means that you'd have nearly the entire two billion words of the corpus on your own machine

  • Updated data similar to the word frequency, collocates, and n-grams data (including the top 40,000 lemmas of Spanish)

  • WordAndPhrase for Spanish, which allows you to browse through the top 40,000 lemmas to see frequency information, definitions, collocates, concordances, and synonyms, all on one page. In addition, you can input your own texts and analyze them with the corpus data.
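
As a rough illustration of what word frequency, n-gram, and collocate data are, the sketch below computes all three from a short sample sentence. It is not the method used to build the Corpus del Español resources: the function names (`tokenize`, `ngram_counts`, `collocates`), the tokenization rule, and the sample text are assumptions made for this example, and it counts raw word forms rather than lemmas.

```python
from collections import Counter
import re


def tokenize(text):
    # Lowercase the text and keep runs of letters, including accented
    # Spanish characters such as "ñ" and "ú". (Illustrative rule only.)
    return re.findall(r"[^\W\d_]+", text.lower())


def ngram_counts(tokens, n):
    # Count every contiguous sequence of n tokens.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def collocates(tokens, node, span=2):
    # Count the words that occur within `span` tokens on either side of `node`.
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Hypothetical sample text, standing in for full-text corpus data.
sample = "el corpus del español es un corpus grande y el corpus es útil"
tokens = tokenize(sample)

word_freq = Counter(tokens)                          # word frequency
bigrams = ngram_counts(tokens, 2)                    # 2-grams
near_corpus = collocates(tokens, "corpus", span=1)   # collocates of "corpus"

print(word_freq.most_common(3))
print(bigrams.most_common(3))
print(near_corpus.most_common(3))
```

With real full-text data, the same counts would be taken over lemmatized tokens and far larger inputs, so a streaming or database-backed approach would likely replace the in-memory lists used here.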