Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement
Hassan Sajjad, Francisco Guzmán, Nadir Durrani, Ahmed Abdelali, Houda Bouamor, Irina Temnikova, and Stephan Vogel.
In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1082-1088, 2016.
In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.
@InProceedings{sajjad-EtAl:2016:N16-1,
author = {Sajjad, Hassan and Guzm\'{a}n, Francisco and Durrani, Nadir and Abdelali, Ahmed and Bouamor, Houda and Temnikova, Irina and Vogel, Stephan},
title = {Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement},
booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2016},
address = {San Diego, California},
publisher = {Association for Computational Linguistics},
pages = {1082--1088},
url = {http://www.aclweb.org/anthology/N16-1125}
}
iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking
@inproceedings{abdelali-durrani-guzman:2016:N16-3,
address = {San Diego, California},
author = {Abdelali, Ahmed and Durrani, Nadir and Guzm\'{a}n, Francisco},
booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations},
link = {http://www.aclweb.org/anthology/N16-3004},
month = {June},
pages = {17--21},
publisher = {Association for Computational Linguistics},
title = {iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking},
year = {2016}
}
Machine Translation Evaluation Meets Community Question Answering
@inproceedings{guzman-marquez-nakov:2016:P16-2,
address = {Berlin, Germany},
author = {Guzm\'{a}n, Francisco and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics},
link = {http://anthology.aclweb.org/P16-2075},
month = {August},
pages = {460--466},
publisher = {Association for Computational Linguistics},
title = {Machine Translation Evaluation Meets Community Question Answering},
year = {2016}
}
MTE-NN at SemEval-2016 Task 3: Can Machine Translation Evaluation Help Community Question Answering?
@inproceedings{guzman-nakov-marquez:2016:SemEval,
address = {San Diego, California},
author = {Guzm\'{a}n, Francisco and Nakov, Preslav and M\`{a}rquez, Llu\'{i}s},
booktitle = {Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)},
link = {http://www.aclweb.org/anthology/S16-1137},
month = {June},
pages = {887--895},
publisher = {Association for Computational Linguistics},
title = {MTE-NN at SemEval-2016 Task 3: Can Machine Translation Evaluation Help Community Question Answering?},
year = {2016}
}
It Takes Three to Tango: Triangulation Approach to Answer Ranking in Community Question Answering
@inproceedings{nakov-marquez-guzman:2016:EMNLP2016,
address = {Austin, Texas},
author = {Nakov, Preslav and M\`{a}rquez, Llu\'{i}s and Guzm\'{a}n, Francisco},
booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
link = {https://aclweb.org/anthology/D16-1165},
month = {November},
pages = {1586--1597},
publisher = {Association for Computational Linguistics},
title = {It Takes Three to Tango: Triangulation Approach to Answer Ranking in Community Question Answering},
year = {2016}
}
Normalizing Mathematical Expressions to Improve the Translation of Educational Content
@inproceedings{zaghouani-EtAl:2016:SeMaT,
address = {Austin, Texas},
author = {Zaghouani, Wajdi and Abdelali, Ahmed and Guzm\'{a}n, Francisco and Sajjad, Hassan},
booktitle = {Proceedings of the Workshop on Semitic Machine Translation},
link = {http://www.aclweb.org/anthology/W/W05/W05-0204},
month = {November},
pages = {20--27},
publisher = {Association for Computational Linguistics},
title = {Normalizing Mathematical Expressions to Improve the Translation of Educational Content},
year = {2016}
}
Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings
Evaluation of machine translation (MT) into morphologically rich languages (MRL) has not been well studied despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically we report on Arabic, a language with complex and rich morphology. Our results show that using a neural-network model with different input representations produces results that clearly outperform the state-of-the-art for MT evaluation into Arabic, by almost over 75% increase in correlation with human judgments on pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations to model sentence similarity for MT evaluation and address complex linguistic phenomena of Arabic.
@inproceedings{guzman-EtAl:2016:COLING,
abstract = {Evaluation of machine translation (MT) into morphologically rich languages
(MRL) has not been well studied despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically we report on Arabic, a language with complex and rich morphology. Our results show that using a neural-network model with different input representations produces results that clearly outperform the state-of-the-art for MT evaluation into Arabic, by almost over 75% increase in correlation with human judgments on pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations to model sentence similarity for MT evaluation and address complex linguistic phenomena of Arabic.},
address = {Osaka, Japan},
author = {Guzm\'{a}n, Francisco and Bouamor, Houda and Baly, Ramy and Habash, Nizar},
booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers},
link = {http://aclweb.org/anthology/C16-1132},
month = {December},
pages = {1398--1408},
publisher = {The COLING 2016 Organizing Committee},
title = {Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings},
year = {2016}
}
Machine translation evaluation with neural networks
We present a framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is embedded into compact distributed vector representations, and fed into a multi-layer neural network that models nonlinear interactions between each of the hypotheses and the reference, as well as between the two hypotheses. We experiment with the benchmark datasets from the WMT Metrics shared task, on which we obtain the best results published so far, with the basic network configuration. We also perform a series of experiments to analyze and understand the contribution of the different components of the network. We evaluate variants and extensions, including fine-tuning of the semantic embeddings, and sentence-based representations modeled with convolutional and recurrent neural networks. In summary, the proposed framework is flexible and generalizable, allows for efficient learning and scoring, and provides an MT evaluation metric that correlates with human judgments, and is on par with the state of the art.
@article{GuzmanCSL2016,
abstract = {We present a framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation.
In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses
is embedded into compact distributed vector representations, and fed into a multi-layer neural network that
models nonlinear interactions between each of the hypotheses and the reference, as well as between the two hypotheses.
We experiment with the benchmark datasets from the {WMT} Metrics shared task, on which we obtain the best results published so far, with the basic network configuration.
We also perform a series of experiments to analyze and understand the contribution of the different components of the network.
We evaluate variants and extensions, including fine-tuning of the semantic embeddings, and sentence-based representations modeled with
convolutional and recurrent neural networks. In summary, the proposed framework is flexible and generalizable, allows for efficient learning and scoring, and provides an {MT} evaluation metric that correlates with human judgments, and is on par with the state of the art.},
author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav},
doi = {10.1016/j.csl.2016.12.005},
issn = {0885-2308},
journal = {Computer Speech \& Language},
link = {http://www.sciencedirect.com/science/article/pii/S0885230816301693},
title = {Machine translation evaluation with neural networks},
year = {2016}
}
2015
Pairwise Neural Machine Translation Evaluation
Francisco Guzmán, Shafiq Joty, Lluís Màrquez, and Preslav Nakov.
In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and The 7th International Joint Conference of the Asian Federation of Natural Language Processing, pages 805-814, 2015.
We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is compacted into relatively small distributed vector representations, and fed into a multi-layer neural network that models the interaction between each of the hypotheses and the reference, as well as between the two hypotheses. These compact representations are in turn based on word and sentence embeddings, which are learned using neural networks. The framework is flexible, allows for efficient learning and classification, and yields correlation with humans that rivals the state of the art.
@InProceedings{guzman2015-ACL,
author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav},
title = {Pairwise Neural Machine Translation Evaluation},
booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and The 7th International Joint Conference of the Asian Federation of Natural Language Processing ({ACL}'15)},
month = {July},
year = {2015},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {805--814},
url = {http://www.aclweb.org/anthology/P15-1078},
Abstract = {We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is compacted into relatively small distributed vector representations, and fed into a multi-layer neural network that models the interaction between each of the hypotheses and the reference, as well as between the two hypotheses. These compact representations are in turn based on word and sentence embeddings, which are learned using neural networks. The framework is flexible, allows for efficient learning and classification, and yields correlation with humans that rivals the state of the art.}
}
The QCN Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT’2015
The paper describes the Egyptian Arabic-to-English statistical machine translation (SMT) system that the QCRI-Columbia-NYUAD (QCN) group submitted to the NIST OpenMT'2015 competition. The competition focused on informal dialectal Arabic, as used in SMS, chat, and speech. Thus, our efforts focused on processing and standardizing Arabic, e.g., using tools such as 3arrib and MADAMIRA. We further trained a phrase-based SMT system using state-of-the-art features and components such as operation sequence model, class-based language model, sparse features, neural network joint model, genre-based hierarchically-interpolated language model, unsupervised transliteration mining, phrase-table merging, and hypothesis combination. Our system ranked second on all three genres.
@InProceedings{sajjad2015-NIST,
title={The QCN Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT'2015},
author={Sajjad, Hassan and Durrani, Nadir and Guzm\'{a}n, Francisco and Nakov, Preslav and Abdelali, Ahmed and Vogel, Stephan and Salloum, Wael and El Kholy, Ahmed and Habash, Nizar},
year={2015},
abstract={The paper describes the Egyptian Arabic-to-English statistical machine translation (SMT) system that the QCRI-Columbia-NYUAD (QCN) group submitted to the NIST OpenMT'2015 competition. The competition focused on informal dialectal Arabic, as used in SMS, chat, and speech. Thus, our efforts focused on processing and standardizing Arabic, e.g., using tools such as 3arrib and MADAMIRA. We further trained a phrase-based SMT system using state-of-the-art features and components such as operation sequence model, class-based language model, sparse features, neural network joint model, genre-based hierarchically-interpolated language model, unsupervised transliteration mining, phrase-table merging, and hypothesis combination. Our system ranked second on all three genres.},
booktitle={Proceedings of the NIST Open Machine Translation Evaluation Workshop},
publisher={NIST}
}
QAT2 – The QCRI Advanced Transcription and Translation System
QAT2 is a multimedia content translation web service developed by QCRI to help content providers reach audiences and viewers speaking different languages. It is built with open source technologies such as KALDI, Moses and MaryTTS, to provide a complete translation experience for web users. It translates text content in its original format, and produces translated videos with speech-to-speech translation. The result is a complete native language experience for end users on foreign language websites. The system currently supports translation from Arabic to English.
@inproceedings{abdelali2015-INTERSPEECH,
title={{QAT}$^2$--The {QCRI} Advanced Transcription and Translation System},
author={Abdelali, Ahmed and Ali, Ahmed and Guzm{\'a}n, Francisco and Stahlberg, Felix and Vogel, Stephan and Zhang, Yifan},
booktitle={Proceedings of the 16th Annual Conference of the International Speech Communication Association},
year={2015},
abstract={QAT2 is a multimedia content translation web service developed by QCRI to help content providers reach audiences and viewers speaking different languages. It is built with open source technologies such as KALDI, Moses and MaryTTS, to provide a complete translation experience for web users. It translates text content in its original format, and produces translated videos with speech-to-speech translation. The result is a complete native language experience for end users on foreign language websites. The system currently supports translation from Arabic to English.}
}
Analyzing Optimization for Statistical Machine Translation: MERT Learns Verbosity, PRO Learns Length
We study the impact of source length and verbosity of the tuning dataset on the performance of parameter optimizers such as MERT and PRO for statistical machine translation. In particular, we test whether the verbosity of the resulting translations can be modified by varying the length or the verbosity of the tuning sentences. We find that MERT learns the tuning set verbosity very well, while PRO is sensitive to both the verbosity and the length of the source sentences in the tuning set; yet, overall PRO learns best from high-verbosity tuning datasets. Given these dependencies, and potentially some other such as amount of reordering, number of unknown words, syntactic complexity, and evaluation measure, to mention just a few, we argue for the need of controlled evaluation scenarios, so that the selection of tuning set and optimization strategy does not overshadow scientific advances in modeling or decoding. In the mean time, until we develop such controlled scenarios, we recommend using PRO with a large verbosity tuning set, which, in our experiments, yields highest BLEU across datasets and language pairs.
@InProceedings{guzman-nakov-vogel:2015:CoNLL,
author = {Guzm\'{a}n, Francisco and Nakov, Preslav and Vogel, Stephan},
title = {Analyzing Optimization for Statistical Machine Translation: MERT Learns Verbosity, PRO Learns Length},
booktitle = {Proceedings of the Nineteenth Conference on Computational Natural Language Learning},
month = {July},
year = {2015},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {62--72},
url = {http://www.aclweb.org/anthology/K15-1007},
abstract = {We study the impact of source length and verbosity of the tuning dataset on the performance of parameter optimizers such as MERT and PRO for statistical machine translation. In particular, we test whether the verbosity of the resulting translations can be modified by varying the length or the verbosity of the tuning sentences. We find that MERT learns the tuning set verbosity very well, while PRO is sensitive to both the verbosity and the length of the source sentences in the tuning set; yet, overall PRO learns best from high-verbosity tuning datasets. Given these dependencies, and potentially some other such as amount of reordering, number of unknown words, syntactic complexity, and evaluation measure, to mention just a few, we argue for the need of controlled evaluation scenarios, so that the selection of tuning set and optimization strategy does not overshadow scientific advances in modeling or decoding. In the mean time, until we develop such controlled scenarios, we recommend using PRO with a large verbosity tuning set, which, in our experiments, yields highest BLEU across datasets and language pairs.}
}
How do Humans Evaluate Machine Translation
Francisco Guzmán, Ahmed Abdelali, Irina Temnikova, Hassan Sajjad, and Stephan Vogel.
In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 457-466, 2015.
In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.
@inproceedings{guzman-EtAl:2015:WMT,
abstract = {In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.},
address = {Lisbon, Portugal},
author = {Guzm\'{a}n, Francisco and Abdelali, Ahmed and Temnikova, Irina and Sajjad, Hassan and Vogel, Stephan},
booktitle = {Proceedings of the Tenth Workshop on Statistical Machine Translation},
link = {http://aclweb.org/anthology/W15-3059},
month = {September},
pages = {457--466},
publisher = {Association for Computational Linguistics},
title = {How do Humans Evaluate Machine Translation},
year = {2015}
}
Errata | How do Humans Evaluate Machine Translation
The AMARA Corpus: Building parallel language resources for the educational domain
This paper presents the AMARA corpus of on-line educational content: a new parallel corpus of educational video subtitles, multilingually aligned for 20 languages, i.e. 20 monolingual corpora and 190 parallel corpora. This corpus includes both resource-rich languages such as English and Arabic, and resource-poor languages such as Hindi and Thai. In this paper, we describe the gathering, validation, and preprocessing of a large collection of parallel, community-generated subtitles. Furthermore, we describe the methodology used to prepare the data for Machine Translation tasks. Additionally, we provide a document-level, jointly aligned development and test sets for 14 language pairs, designed for tuning and testing Machine Translation systems. We provide baseline results for these tasks, and highlight some of the challenges we face when building machine translation systems for educational content.
@inproceedings{abdelali:2014:amara,
abstract = {This paper presents the AMARA corpus of on-line educational content: a new parallel corpus of educational video subtitles, multilingually aligned for 20 languages, i.e. 20 monolingual corpora and 190 parallel corpora. This corpus includes both resource-rich languages such as English and Arabic, and resource-poor languages such as Hindi and Thai. In this paper, we describe the gathering, validation, and preprocessing of a large collection of parallel, community-generated subtitles. Furthermore, we describe the methodology used to prepare the data for Machine Translation tasks. Additionally, we provide a document-level, jointly aligned development and test sets for 14 language pairs, designed for tuning and testing Machine Translation systems. We provide baseline results for these tasks, and highlight some of the challenges we face when building machine translation systems for educational content.},
address = {Reykjavik, Iceland},
author = {Abdelali, Ahmed and Guzm{\'a}n, Francisco and Sajjad, Hassan and Vogel, Stephan},
booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation ({LREC}'14)},
month = {April},
title = {The {AMARA} Corpus: Building parallel language resources for the educational domain},
year = {2014}
}
Amara: A Sustainable, Global Solution for Accessibility, Powered by Communities of Volunteers
In this paper, we present the main features of the Amara project, and its impact on the accessibility landscape with the use of innovative technology. We also show the effectiveness of volunteer communities in addressing large subtitling and translation tasks, that accompany the ever-growing amounts of online video content. Furthermore, we present two different applications for the platform. First, we examine the growing interest of organizations to build their own subtitling communities. Second, we present how the community-generated material can be used to advance the state-of-the-art of research in fields such as Statistical Machine Translation with focus on educational translation. We provide examples on how both tasks can be achieved successfully.
@inproceedings{jansen:2014:amara,
abstract = {In this paper, we present the main features of the Amara project, and its impact on the accessibility landscape with the use of innovative technology. We also show the effectiveness of volunteer communities in addressing large subtitling and translation tasks, that accompany the ever-growing amounts of online video content. Furthermore, we present two different applications for the platform. First, we examine the growing interest of organizations to build their own subtitling communities. Second, we present how the community-generated material can be used to advance the state-of-the-art of research in fields such as Statistical Machine Translation with focus on educational translation. We provide examples on how both tasks can be achieved successfully.},
address = {Heraklion, Greece},
author = {Jansen, Dean and Alcala, Aleli and Guzm{\'a}n, Francisco},
booktitle = {Universal Access in Human-Computer Interaction. Design for All and Accessibility Practice},
month = {June},
pages = {401--411},
publisher = {Springer International Publishing},
title = {Amara: A Sustainable, Global Solution for Accessibility, Powered by Communities of Volunteers},
year = {2014}
}
Using Discourse Structure Improves Machine Translation Evaluation
We present experiments in using discourse structure for improving machine translation evaluation. We first design two discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory. Then, we show that these measures can help improve a number of existing machine translation evaluation metrics both at the segment- and at the system-level. Rather than proposing a single new metric, we show that discourse information is complementary to the state-of-the-art evaluation metrics, and thus should be taken into account in the development of future richer evaluation metrics.
@inproceedings{guzman-EtAl:2014:P14-1,
abstract = {We present experiments in using discourse structure for improving machine translation evaluation. We first design two discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory. Then, we show that these measures can help improve a number of existing machine translation evaluation metrics both at the segment- and at the system-level. Rather than proposing a single new metric, we show that discourse information is complementary to the state-of-the-art evaluation metrics, and thus should be taken into account in the development of future richer evaluation metrics.},
address = {Baltimore, Maryland, USA},
author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics ({ACL}'14)},
link = {http://www.aclweb.org/anthology/P/P14/P14-1065},
month = {June},
pages = {687--698},
publisher = {Association for Computational Linguistics},
title = {Using Discourse Structure Improves Machine Translation Evaluation},
year = {2014}
}
DiscoTK: Using Discourse Structure for Machine Translation Evaluation
We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference. We experiment with five transformations and augmentations of a base discourse tree representation based on the rhetorical structure theory, and we combine the kernel scores for each of them into a single score. Finally, we add other metrics from the ASIYA MT evaluation toolkit, and we tune the weights of the combination on actual human judgments. Experiments on the WMT12 and WMT13 metrics shared task datasets show correlation with human judgments that outperforms what the best systems that participated in these years achieved, both at the segment and at the system level.
@inproceedings{joty-EtAl:2014:W14-33,
abstract = {We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference. We experiment with five transformations and augmentations of a base discourse tree representation based on the rhetorical structure theory, and we combine the kernel scores for each of them into a single score. Finally, we add other metrics from the ASIYA MT evaluation toolkit, and we tune the weights of the combination on actual human judgments. Experiments on the WMT12 and WMT13 metrics shared task datasets show correlation with human judgments that outperforms what the best systems that participated in these years achieved, both at the segment and at the system level.},
address = {Baltimore, Maryland, USA},
author = {Joty, Shafiq and Guzm\'{a}n, Francisco and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav},
booktitle = {Proceedings of the Ninth Workshop on Statistical Machine Translation ({WMT}'14)},
link = {http://www.aclweb.org/anthology/W/W14/W14-3352},
month = {June},
pages = {402--408},
publisher = {Association for Computational Linguistics},
title = {DiscoTK: Using Discourse Structure for Machine Translation Evaluation},
year = {2014}
}
Learning to Differentiate Better from Worse Translations
Francisco Guzmán, Shafiq Joty, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov, and Massimo Nicosia.
In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 214-220, 2014.
We present a pairwise learning-to-rank approach to machine translation evaluation that learns to differentiate better from worse translations in the context of a given reference. We integrate several layers of linguistic information encapsulated in tree-based structures, making use of both the reference and the system output simultaneously, thus bringing our ranking closer to how humans evaluate translations. Most importantly, instead of deciding upfront which types of features are important, we use the learning framework of preference re-ranking kernels to learn the features automatically. The evaluation results show that learning in the proposed framework yields better correlation with humans than computing the direct similarity over the same type of structures. Also, we show our structural kernel learning (SKL) can be a general framework for MT evaluation, in which syntactic and semantic information can be naturally incorporated.
@inproceedings{guzman-EtAl:2014:EMNLP2014,
abstract = {We present a pairwise learning-to-rank approach to machine translation evaluation that learns to differentiate better from worse translations in the context of a given reference. We integrate several layers of linguistic information encapsulated in tree-based structures, making use of both the reference and the system output simultaneously, thus bringing our ranking closer to how humans evaluate translations. Most importantly, instead of deciding upfront which types of features are important, we use the learning framework of preference re-ranking kernels to learn the features automatically. The evaluation results show that learning in the proposed framework yields better correlation with humans than computing the direct similarity over the same type of structures. Also, we show our structural kernel learning (SKL) can be a general framework for MT evaluation, in which syntactic and semantic information can be naturally incorporated.},
address = {Doha, Qatar},
author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Moschitti, Alessandro and Nakov, Preslav and Nicosia, Massimo},
booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP'14)},
link = {http://www.aclweb.org/anthology/D14-1027},
month = {October},
pages = {214--220},
publisher = {Association for Computational Linguistics},
title = {Learning to Differentiate Better from Worse Translations},
year = {2014}
}
A Tale about PRO and Monsters
Preslav Nakov, Francisco Guzmán, and Stephan Vogel.
In
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL'13), pages 12-17,
2013.
While experimenting with tuning on long sentences, we made an unexpected discovery: that PRO falls victim to monsters (overly long negative examples with very low BLEU+1 scores), which are unsuitable for learning and can cause testing BLEU to drop by several points absolute. We propose several effective ways to address the problem, using length- and BLEU+1-based cut-offs, outlier filters, stochastic sampling, and random acceptance. The best of these fixes not only slay and protect against monsters, but also yield higher stability for PRO as well as improved test-time BLEU scores. Thus, we recommend them to anybody using PRO, monster-believer or not.
@inproceedings{nakov-guzman-vogel:2013:Short,
abstract = {While experimenting with tuning on long sentences, we made an unexpected discovery: that PRO falls victim to monsters (overly long negative examples with very low BLEU+1 scores), which are unsuitable for learning and can cause testing BLEU to drop by several points absolute. We propose several effective ways to address the problem, using length- and BLEU+1-based cut-offs, outlier filters, stochastic sampling, and random acceptance. The best of these fixes not only slay and protect against monsters, but also yield higher stability for PRO as well as improved test-time BLEU scores. Thus, we recommend them to anybody using PRO, monster-believer or not.},
address = {Sofia, Bulgaria},
author = {Nakov, Preslav and Guzm{\'a}n, Francisco and Vogel, Stephan},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics ({ACL'13})},
link = {http://www.aclweb.org/anthology/P13-2003},
month = {August},
pages = {12--17},
title = {A Tale about {PRO} and Monsters},
year = {2013}
}
Parameter Optimization for Statistical Machine Translation: It Pays to Learn from Hard Examples
Research on statistical machine translation has focused on particular translation directions, typically with English as the target language, e.g., from Arabic to English. When we reverse the translation direction, the multiple reference translations turn into multiple possible inputs, which offers both challenges and opportunities. We propose and evaluate several strategies for making use of these multiple inputs: (a) select one of the datasets, (b) select the best input for each sentence, and (c) synthesize an input for each sentence by fusing the available inputs. Surprisingly, we find out that it is best to tune on the hardest available input, not on the one that yields the highest BLEU score. This finding has implications on how to pick good translators and how to select useful data for parameter optimization in SMT.
@inproceedings{nakov:2013:parameter,
abstract = {Research on statistical machine translation has focused on particular translation directions, typically with English as the target language, e.g., from Arabic to English. When we reverse the translation direction, the multiple reference translations turn into multiple possible inputs, which offers both challenges and opportunities. We propose and evaluate several strategies for making use of these multiple inputs: (a) select one of the datasets, (b) select the best input for each sentence, and (c) synthesize an input for each sentence by fusing the available inputs. Surprisingly, we find out that it is best to tune on the hardest available input, not on the one that yields the highest BLEU score. This finding has implications on how to pick good translators and how to select useful data for parameter optimization in SMT.},
author = {Nakov, Preslav and Al Obaidli, Fahad and Guzm{\'a}n, Francisco and Vogel, Stephan},
booktitle = {Proceedings of the International Conference Recent Advances in Natural Language Processing ({RANLP}'13)},
month = {September},
title = {Parameter Optimization for Statistical Machine Translation: It Pays to Learn from Hard Examples},
year = {2013}
}
QCRI at IWSLT 2013: Experiments in Arabic-English and English-Arabic Spoken Language Translation
We describe the Arabic-English and English-Arabic statistical machine translation systems developed by the Qatar Computing Research Institute for the IWSLT’2013 evaluation campaign on spoken language translation. We used one phrase-based and two hierarchical decoders, exploring various settings thereof. We further experimented with three domain adaptation methods, and with various Arabic word segmentation schemes. Combining the output of several systems yielded a gain of up to 3.4 BLEU points over the baseline. Here we also describe a specialized normalization scheme for evaluating Arabic output, which was adopted for the IWSLT’2013 evaluation campaign.
@inproceedings{sajjad:2013:qcri,
abstract = {We describe the Arabic-English and English-Arabic statistical machine translation systems developed by the Qatar Computing Research Institute for the IWSLT’2013 evaluation campaign on spoken language translation. We used one phrase-based and two hierarchical decoders, exploring various settings thereof. We further experimented with three domain adaptation methods, and with various Arabic word segmentation schemes. Combining the output of several systems yielded a gain of up to 3.4 BLEU points over the baseline. Here we also describe a specialized normalization scheme for evaluating Arabic output, which was adopted for the IWSLT’2013 evaluation campaign.},
address = {Heidelberg, Germany},
author = {Sajjad, Hassan and Guzm{\'a}n, Francisco and Nakov, Preslav and Abdelali, Ahmed and Murray, Kenton and Al Obaidli, Fahad and Vogel, Stephan},
booktitle = {Proceedings of the 10th International Workshop on Spoken Language Translation {(IWSLT'13)}},
month = {December},
title = {{QCRI} at {IWSLT} 2013: Experiments in Arabic-English and English-Arabic Spoken Language Translation},
volume = {13},
year = {2013}
}
The AMARA Corpus: Building Resources for Translating the Web's Educational Content
In this paper, we introduce a new parallel corpus of subtitles of educational videos: the AMARA corpus for online educational content. We crawl a multilingual collection of community-generated subtitles, and present the results of processing the Arabic–English portion of the data, which yields a parallel corpus of about 2.6M Arabic and 3.9M English words. We explore different approaches to align the segments, and extrinsically evaluate the resulting parallel corpus on the standard TED-talks tst-2010. We observe that the data can be successfully used for this task, and also observe an absolute improvement of 1.6 BLEU when it is used in combination with TED data. Finally, we analyze some of the specific challenges when translating the educational content.
@inproceedings{guzman:2013:amara,
abstract = {In this paper, we introduce a new parallel corpus of subtitles of educational videos: the AMARA corpus for online educational content. We crawl a multilingual collection of community-generated subtitles, and present the results of processing the Arabic–English portion of the data, which yields a parallel corpus of about 2.6M Arabic and 3.9M English words. We explore different approaches to align the segments, and extrinsically evaluate the resulting parallel corpus on the standard TED-talks tst-2010. We observe that the data can be successfully used for this task, and also observe an absolute improvement of 1.6 BLEU when it is used in combination with TED data. Finally, we analyze some of the specific challenges when translating the educational content.},
address = {Heidelberg, Germany},
author = {Guzm{\'a}n, Francisco and Sajjad, Hassan and Abdelali, Ahmed and Vogel, Stephan},
booktitle = {Proceedings of the 10th International Workshop on Spoken Language Translation {(IWSLT'13)}},
month = {December},
title = {The {AMARA} Corpus: Building Resources for Translating the Web's Educational Content},
volume = {13},
year = {2013}
}
QCRI at WMT12: Experiments in Spanish-English and German-English Machine Translation of News Text
We describe the systems developed by the team of the Qatar Computing Research Institute for the WMT12 Shared Translation Task. We used a phrase-based statistical machine translation model with several non-standard settings, most notably tuning data selection and phrase table combination. The evaluation results show that we rank second in BLEU and TER for Spanish-English, and in the top tier for German-English.
@InProceedings{guzman-EtAl:2012:WMT,
author = {Guzm{\'a}n, Francisco and Nakov, Preslav and Thabet, Ahmed and Vogel, Stephan},
title = {{QCRI} at {WMT}12: Experiments in Spanish-English and German-English Machine Translation of News Text},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation ({WMT'12})},
month = {June},
year = {2012},
address = {Montr{\'e}al, Canada},
publisher = {Association for Computational Linguistics},
pages = {298--303},
url = {http://www.aclweb.org/anthology/W12-3136},
abstract = {We describe the systems developed by the team of the Qatar Computing Research Institute for the WMT12 Shared Translation Task. We used a phrase-based statistical machine translation model with several non-standard settings, most notably tuning data selection and phrase table combination. The evaluation results show that we rank second in BLEU and TER for Spanish-English, and in the top tier for German-English.}
}
Optimizing for Sentence-Level BLEU+1 Yields Short Translations
We study a problem with pairwise ranking optimization (PRO): that it tends to yield too short translations. We find that this is partially due to the inadequate smoothing in PRO’s BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balance between the two, compared to BLEU. It is also partially due to PRO optimizing for a sentence-level score without a global view on the overall length, which introduces a bias towards short translations; we show that letting PRO optimize a corpus-level BLEU yields a perfect length. Finally, we find some residual bias due to the interaction of PRO with BLEU+1: such a bias does not exist for a version of MIRA with sentence-level BLEU+1. We propose several ways to fix the length problem of PRO, including smoothing the brevity penalty, scaling the effective reference length, grounding the precision component, and unclipping the brevity penalty, which yield sizable improvements in test BLEU on two Arabic-English datasets: IWSLT (+0.65) and NIST (+0.37).
@inproceedings{nakov-guzman-vogel:2012:PAPERS,
author = {Nakov, Preslav and Guzm{\'a}n, Francisco and Vogel, Stephan},
title = {Optimizing for Sentence-Level {BLEU}+1 Yields Short Translations},
booktitle = {Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012)},
month = {December},
year = {2012},
address = {Mumbai, India},
publisher = {The {COLING} 2012 Organizing Committee},
pages = {1979--1994},
url = {http://www.aclweb.org/anthology/C12-1121},
abstract = {We study a problem with pairwise ranking optimization (PRO): that it tends to yield too short translations. We find that this is partially due to the inadequate smoothing in PRO’s BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balance between the two, compared to BLEU. It is also partially due to PRO optimizing for a sentence-level score without a global view on the overall length, which introduces a bias towards short translations; we show that letting PRO optimize a corpus-level BLEU yields a perfect length. Finally, we find some residual bias due to the interaction of PRO with BLEU+1: such a bias does not exist for a version of MIRA with sentence-level BLEU+1. We propose several ways to fix the length problem of PRO, including smoothing the brevity penalty, scaling the effective reference length, grounding the precision component, and unclipping the brevity penalty, which yield sizable improvements in test BLEU on two Arabic-English datasets: IWSLT (+0.65) and NIST (+0.37).}
}
Understanding the Performance of Statistical MT Systems: A Linear Regression Framework
We present a framework for the analysis of Machine Translation performance. We use multivariate linear models to determine the impact of a wide range of features on translation performance. Our assumption is that variables that most contribute to predict translation performance are the key to understand the differences between good and bad translations. During training, we learn the regression parameters that better predict translation quality using a wide range of input features based on the translation model and the first-best translation hypotheses. We use a linear regression with regularization. Our results indicate that with regularized linear regression, we can achieve higher levels of correlation between our predicted values and the actual values of the quality metrics. Our analysis shows that the performance for in-domain data is largely dependent on the characteristics of the translation model. On the other hand, out-of-domain data can benefit from better reordering strategies.
@InProceedings{guzman-vogel:2012:PAPERS,
author = {Guzm{\'a}n, Francisco and Vogel, Stephan},
title = {Understanding the Performance of Statistical {MT} Systems: A Linear Regression Framework},
booktitle = {Proceedings of the 24th International Conference on Computational Linguistics {(COLING 2012)}},
month = {December},
year = {2012},
address = {Mumbai, India},
pages = {1029--1044},
url = {http://www.aclweb.org/anthology/C12-1063},
abstract = {We present a framework for the analysis of Machine Translation performance. We use multivariate linear models to determine the impact of a wide range of features on translation performance. Our assumption is that variables that most contribute to predict translation performance are the key to understand the differences between good and bad translations. During training, we learn the regression parameters that better predict translation quality using a wide range of input features based on the translation model and the first-best translation hypotheses. We use a linear regression with regularization. Our results indicate that with regularized linear regression, we can achieve higher levels of correlation between our predicted values and the actual values of the quality metrics. Our analysis shows that the performance for in-domain data is largely dependent on the characteristics of the translation model. On the other hand, out-of-domain data can benefit from better reordering strategies.}
}
2011
Word Alignment Revisited
Francisco Guzmán, Qin Gao, Jan Niehues, and Stephan Vogel.
In
Handbook of Natural Language Processing and Machine Translation: DARPA global autonomous language exploitation
,
pages 164-175.
Joseph Olive, Caitlin Christianson, and John McCary (Eds).
Springer Science & Business Media.
2011.
@incollection{francisco:2011:word,
Author = {Guzm{\'a}n, Francisco and Gao, Qin and Niehues, Jan and Vogel, Stephan},
Booktitle = {Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation},
Pages = {164--175},
editor = {Olive, Joseph and Christianson, Caitlin and McCary, John},
publisher = {Springer Science \& Business Media},
Title = {Word Alignment Revisited},
Year = {2011}
}
The Impact of Statistical Word Alignment Quality and Structure in Phrase Based Statistical Machine Translation
Statistical Word Alignments represent lexical word-to-word translations between source and target language sentences. They are considered the starting point for many state of the art Statistical Machine Translation (SMT) systems. In phrase-based systems, word alignments are loosely linked to the translation model. Despite the improvements reached in word alignment quality, there has been a modest improvement in the end-to-end translation. Until recently, little or no attention was paid to the structural characteristics of word-alignments (e.g. unaligned words) and their impact in further stages of the phrase-based SMT pipeline. A better understanding of the relationship between word alignment and the entailing processes will help to identify the variables across the pipeline that most influence translation performance and can be controlled by modifying word alignment’s characteristics. In this dissertation, we perform an in-depth study of the impact of word alignments at different stages of the phrase-based statistical machine translation pipeline, namely word alignment, phrase extraction, phrase scoring and decoding. Moreover, we establish a multivariate prediction model for different variables of word alignments, phrase tables and translation hypotheses. Based on those models, we identify the most important alignment variables and propose two alternatives to provide more control over alignment structure and thus improve SMT. Our results show that using alignment structure in decoding, via alignment gap features, yields significant improvements, especially in situations where translation data is limited. During the development of this dissertation we discovered how different characteristics of the alignment impact Machine Translation. We observed that while good quality alignments yield good phrase-pairs, the consolidation of a translation model is dependent on the alignment structure, not quality.
Human-alignments are more dense than the computer generated counterparts, which tend to be more sparse and precision-oriented. Trying to emulate human-like alignment structure resulted in poorer systems, because the resulting translation models tend to be more compact and lack translation options. On the other hand, more translation options, even if they are noisier, help to improve the quality of the translation. This is due to the fact that translation does not rely only on the translation model, but also other factors that help to discriminate the noise from bad translations (e.g. the language model). Lastly, when we provide the decoder with features that help it to make more "informed decisions" we observe a clear improvement in translation quality. This was especially true for the discriminative alignments which inherently leave more unaligned words. The result is more evident in low-resource settings where having larger translation lexicons represents more translation options. Using simple features to help the decoder discriminate translation hypotheses clearly showed consistent improvements.
@phdthesis{guzman-thesis:2011,
abstract = {Statistical Word Alignments represent lexical word-to-word translations between source and target language sentences. They are considered the starting point for many state of the art Statistical Machine Translation (SMT) systems. In phrase-based systems, word alignments are loosely linked to the translation model. Despite the improvements reached in word alignment quality, there has been a modest improvement in the end-to-end translation. Until recently, little or no attention was paid to the structural characteristics of word-alignments (e.g. unaligned words) and their impact in further stages of the phrase-based SMT pipeline. A better understanding of the relationship between word alignment and the entailing processes will help to identify the variables across the pipeline that most influence translation performance and can be controlled by modifying word alignment’s characteristics.
In this dissertation, we perform an in-depth study of the impact of word alignments at different stages of the phrase-based statistical machine translation pipeline, namely word alignment, phrase extraction, phrase scoring and decoding. Moreover, we establish a multivariate prediction model for different variables of word alignments, phrase tables and translation hypotheses. Based on those models, we identify the most important alignment variables and propose two alternatives to provide more control over alignment structure and thus improve SMT. Our results show that using alignment structure in decoding, via alignment gap features, yields significant improvements, especially in situations where translation data is limited. During the development of this dissertation we discovered how different characteristics of the alignment impact Machine Translation. We observed that while good quality alignments yield good phrase-pairs, the consolidation of a translation model is dependent on the alignment structure, not quality. Human-alignments are more dense than the computer generated counterparts, which tend to be more sparse and precision-oriented. Trying to emulate human-like alignment structure resulted in poorer systems, because the resulting translation models tend to be more compact and lack translation options. On the other hand, more translation options, even if they are noisier, help to improve the quality of the translation. This is due to the fact that translation does not rely only on the translation model, but also other factors that help to discriminate the noise from bad translations (e.g. the language model). Lastly, when we provide the decoder with features that help it to make more ``informed decisions'' we observe a clear improvement in translation quality. This was especially true for the discriminative alignments which inherently leave more unaligned words.
The result is more evident in low-resource settings where having larger translation lexicons represents more translation options. Using simple features to help the decoder discriminate translation hypotheses clearly showed consistent improvements.},
author = {Guzm{\'a}n, Francisco},
month = {December},
school = {Instituto Tecnológico y de Estudios Superiores de Monterrey, Campus Monterrey},
title = {The Impact of Statistical Word Alignment Quality and Structure in Phrase Based Statistical Machine Translation},
year = {2011}
}
EMDC: a semi-supervised approach for word alignment
This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high precision partial alignments that serve as constraints for a generative aligner which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements on AER. We also experimented with moderate-size Chinese machine translation tasks and got an average of 0.5 point improvement on BLEU scores across five standard NIST test sets and four other test sets.
@inproceedings{gao:2010:emdc, Author = {Gao, Qin and Guzm{\'a}n, Francisco and Vogel, Stephan}, Booktitle = {Proceedings of the 23rd International Conference on Computational Linguistics {(COLING 2010)}}, Organization = {Association for Computational Linguistics}, Pages = {349--357}, Title = {EMDC: a semi-supervised approach for word alignment},
address = {Beijing, China},
Year = {2010},
Abstract = {This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high precision partial alignments that serve as constraints for a generative aligner which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements on AER. We also experimented with moderate-size Chinese machine translation tasks and got an average of 0.5 point improvement on BLEU scores across five standard NIST test sets and four other test sets.}}
Reassessment of the Role of Phrase Extraction in PBSMT
In this paper we study in detail the relation between word alignment and phrase extraction. First, we analyze different word alignments according to several characteristics and compare them to hand-aligned data. Secondly, we analyzed the phrase-pairs generated by these alignments. We observed that the number of unaligned words has a large impact on the characteristics of the phrase table. A manual evaluation of phrase pair quality showed that the increase in the number of unaligned words results in a lower quality. Finally, we present translation results from using the number of unaligned words as features from which we obtain up to 2BP of improvement.
@inproceedings{guzman:2009:reassessment,
Author = {Guzm{\'a}n, Francisco and Gao, Qin and Vogel, Stephan},
Booktitle = {The Twelfth Machine Translation Summit ({MTSummit XII})},
Organization = {International Association for Machine Translation},
Title = {Reassessment of the Role of Phrase Extraction in {PBSMT} },
address = {Ottawa, Canada},
Month = {August},
Year = {2009},
Abstract ={In this paper we study in detail the relation between word alignment and phrase extraction. First, we analyze different word alignments according to several characteristics and compare them to hand-aligned data. Secondly, we analyzed the phrase-pairs generated by these alignments. We observed that the number of unaligned words has a large impact on the characteristics of the phrase table. A manual evaluation of phrase pair quality showed that the increase in the number of unaligned words results in a lower quality. Finally, we present translation results from using the number of unaligned words as features from which we obtain up to 2BP of improvement.}}
Translation paraphrases in phrase-based machine translation
In this paper we present an analysis of a phrase-based machine translation methodology that integrates paraphrases obtained from an intermediary language (French) for translations between Spanish and English. The purpose of the research presented in this document is to find out how much extra information (i.e. improvements in translation quality) can be found when using Translation Paraphrases (TPs). In this document we present an extensive statistical analysis to support conclusions.
@inproceedings{guzman:2008:translation, Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo}, Booktitle = {Computational Linguistics and Intelligent Text Processing {(CICLing'08)}}, Pages = {388--398}, address = {Haifa, Israel}, Title = {Translation paraphrases in phrase-based machine translation}, Publisher = {Springer Berlin Heidelberg}, Month = { February}, Year = {2008}, Abstract = { In this paper we present an analysis of a phrase-based machine translation methodology that integrates paraphrases obtained from an intermediary language (French) for translations between Spanish and English. The purpose of the research presented in this document is to find out how much extra information (i.e. improvements in translation quality) can be found when using Translation Paraphrases (TPs). In this document we present an extensive statistical analysis to support conclusions.}}
Using Translation Paraphrases from Trilingual Corpora to Improve Phrase-Based Statistical Machine Translation: A Preliminary Report
Statistical methods have proven to be very effective when addressing linguistic problems, especially when dealing with Machine Translation. Nevertheless, Statistical Machine Translation effectiveness is limited to situations where large amounts of training data are available. Therefore, the broader the coverage of an SMT system is, the better the chances to get a reasonable output are. In this paper we propose a method to improve quality of translations of a phrase-based Machine Translation system by extending phrase-tables with the use of translation paraphrases learned from a third language. Our experiments were done translating from Spanish to English pivoting through French.
@inproceedings{guzman:2007:using, Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo}, Booktitle = {Sixth Mexican International Conference on Artificial Intelligence {(MICAI'07)}}, Organization = {IEEE}, Pages = {163--172}, Title = {Using Translation Paraphrases from Trilingual Corpora to Improve Phrase-Based Statistical Machine Translation: A Preliminary Report}, Month = {November}, address = {Aguascalientes, Mexico}, Year = {2007}, Abstract = {Statistical methods have proven to be very effective when addressing linguistic problems, especially when dealing with Machine Translation. Nevertheless, Statistical Machine Translation effectiveness is limited to situations where large amounts of training data are available. Therefore, the broader the coverage of an SMT system is, the better the chances to get a reasonable output are. In this paper we propose a method to improve quality of translations of a phrase-based Machine Translation system by extending phrase-tables with the use of translation paraphrases learned from a third language. Our experiments were done translating from Spanish to English pivoting through French.}}
Towards Traffic Light Control Through a Cooperative Multiagent System: A Simulation-Based Study
Present day traffic networks are unable to efficiently handle the daily car traffic through urban areas. We think that multiagent systems are an excellent way of doing microscopic simulation and thus provide possible solutions to the traffic control problem. In this paper, we present our simulation-based study to simulate traffic networks and optimize them via a multiagent cooperative system for traffic light control. This system simulates the traffic on an intersection, minimizing the time that each car has to wait in order to be served. Light agents can communicate each other in order to negotiate and share their light times. Our experimental results have shown how our approach can decrease the average car delay while the spawn probability is increased varying the service time and the number of traffic lights sets at a specific intersection. These results show important improvements using our multiagent light control system.
@inproceedings{guzman:2005:towards,
Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo},
Booktitle = {Proceedings of the 2005 Agent-Directed Simulation Symposium ({ADS05}) at the 2005 Spring Simulation Multiconference ({SpringSim'05})},
Pages = {29--35},
Title = {Towards Traffic Light Control Through a Cooperative Multiagent System: A Simulation-Based Study},
address = {San Diego, California, USA},
month = {April},
Year = {2005},
Abstract = {Present day traffic networks are unable to efficiently handle the daily car traffic through urban areas. We think that multiagent systems are an excellent way of doing microscopic simulation and thus provide possible solutions to the traffic control problem. In this paper, we present our simulation-based study to simulate traffic networks and optimize them via a multiagent cooperative system for traffic light control. This system simulates the traffic on an intersection, minimizing the time that each car has to wait in order to be served. Light agents can communicate with each other in order to negotiate and share their light times. Our experimental results have shown how our approach can decrease the average car delay while the spawn probability is increased varying the service time and the number of traffic light sets at a specific intersection. These results show important improvements using our multiagent light control system.}}
@techreport{guzman:2005:multiagent,
Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo},
Institution = {Tecnol{\'o}gico de Monterrey ({ITESM}), Monterrey, NL, M{\'e}xico},
Title = {Multiagent-Based Traffic Simulation},
Number = {CSI-RI-005},
address = {Monterrey, Mexico},
Year = {2005}}
Talks
A Brief Introduction to MT Evaluation
I gave a talk, "A Brief Introduction to MT Evaluation", at the First MT Marathon in the Americas. 2015
Preparing your summer internship. Hot Summer/Cool Research program.
I gave a talk to my colleagues at QCRI with recommendations on how to prepare for the summer internship program. 2015
Machine Translation: A Game Changer for the Translation Professionals?
Introduction to SMT and post-editing for professional translators. With Stephan Vogel. Sixth Annual International Translation Conference (TII). 2015
Método para la Traducción Automática Estadística
Introductory talk on SMT (in Spanish). Tecnológico de Orizaba. 2009