There are several resources that are based on the older
version of the Corpus del Español (which was released in 2001), such as:
The older Corpus del Español was quite small,
however (only 20 million words for the 1900s). As a result, many
types of resources that we've created for English couldn't be created for
Spanish until a much larger corpus was available. With the new two-billion-word
corpus, we can create many of these resources. They will include:
- Full-text data, which means that you'd have nearly the entire two
  billion words of data on your machine
- Updated data similar to the word frequency, collocates, and n-grams data
  (including the top 40,000 lemmas of Spanish)
- WordAndPhrase for Spanish, which allows you to browse through the
  top 40,000 lemmas to see frequency information, definitions, collocates,
  concordances, and synonyms, all on one page. In addition, you can input
  your own texts and analyze them with the corpus data.