The Europarl Parallel Corpus is extracted from the proceedings of the European Parliament. This dataset is a sample of the Spanish-English language pair.

Usage

data("europarle_sample")

Format

A data frame with 200,000 observations on the following 3 variables.

type

Whether the sentence is in the source or the target language

sentence_id

ID that indexes the sentence pairs

sentence

Each line from the proceedings, including comments

Details

Taken from the version 7 release of the corpus.

References

Koehn, P. (2005, September). Europarl: A parallel corpus for statistical machine translation. In MT Summit (Vol. 5, pp. 79-86).

Examples

data(europarle_sample)
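
Because the data are stored in long form (one row per sentence, with type marking its role), a common first step is to put each pair side by side. The sketch below shows one way to do that in base R. It runs on a small invented data frame named toy, not on the real data, and it assumes that type holds the literal values "Source" and "Target" and that both sentences of a pair share one sentence_id. Check unique(europarle_sample$type) before adapting it.

```r
# Toy stand-in with the three documented columns.
# The rows and the literal type labels are invented for illustration.
toy <- data.frame(
  type = c("Source", "Target", "Source", "Target"),
  sentence_id = c(1L, 1L, 2L, 2L),
  sentence = c("Hola.", "Hello.", "Gracias.", "Thank you."),
  stringsAsFactors = FALSE
)

# Split the rows by role, keeping only the key and the text.
src <- toy[toy$type == "Source", c("sentence_id", "sentence")]
tgt <- toy[toy$type == "Target", c("sentence_id", "sentence")]

# Join on sentence_id so each row holds one aligned pair.
pairs <- merge(src, tgt, by = "sentence_id",
               suffixes = c("_source", "_target"))

print(pairs)
```

If the assumptions hold, replacing toy with europarle_sample should give one row per sentence_id with the columns sentence_source and sentence_target.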