PANACEA Environment English Monolingual Corpus

This data set is the English part of the second version of the monolingual corpus (MCv2) acquired in the framework of PANACEA, an EU FP7-funded project under Grant Agreement 248064. It contains documents that were acquired from the web, automatically identified as English, and automatically classified as relevant to the "Environment" (ENV) domain.

An N-gram list version of this corpus is also available.

Size information:

  • tokens: 50,541,538

  • Download location

    DISCLAIMER: The right to use the sentences contained in this data set has been granted by their copyright holders. Use is restricted to research purposes, and no profit may be made from it. We are grateful to all sources for their kind and generous contribution.

    For further information on these sources, please see: Acknowledgements

    This resource is distributed under the following licence: CC-BY-NC-SA