Papers and Talks

Papers

2016
Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement
Hassan Sajjad, Francisco Guzmán, Nadir Durrani, Ahmed Abdelali, Houda Bouamor, Irina Temnikova, and Stephan Vogel. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1082-1088, 2016.
PDF Abstract

In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.
BibTex

@InProceedings{sajjad-EtAl:2016:N16-1, author = {Sajjad, Hassan and Guzm\'{a}n, Francisco and Durrani, Nadir and Abdelali, Ahmed and Bouamor, Houda and Temnikova, Irina and Vogel, Stephan}, title = {Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement}, booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies}, month = {June}, year = {2016}, address = {San Diego, California}, publisher = {Association for Computational Linguistics}, pages = {1082--1088}, url = {http://www.aclweb.org/anthology/N16-1125} }
Slides Software/Code
iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking
Ahmed Abdelali, Nadir Durrani, and Francisco Guzmán. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 17-21, 2016.
PDF BibTex

@inproceedings{abdelali-durrani-guzman:2016:N16-3, address = {San Diego, California}, author = {Abdelali, Ahmed and Durrani, Nadir and Guzm\'{a}n, Francisco}, booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations}, link = {http://www.aclweb.org/anthology/N16-3004}, month = {June}, pages = {17--21}, publisher = {Association for Computational Linguistics}, title = {iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking}, year = {2016} }
Slides Software/Code
Machine Translation Evaluation Meets Community Question Answering
Francisco Guzmán, Lluís Màrquez, and Preslav Nakov. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pages 460-466, 2016.
PDF BibTex

@inproceedings{guzman-marquez-nakov:2016:P16-2, address = {Berlin, Germany}, author = {Guzm\'{a}n, Francisco and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav}, booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics}, link = {http://anthology.aclweb.org/P16-2075}, month = {August}, pages = {460--466}, publisher = {Association for Computational Linguistics}, title = {Machine Translation Evaluation Meets Community Question Answering}, year = {2016} }
MTE-NN at SemEval-2016 Task 3: Can Machine Translation Evaluation Help Community Question Answering?
Francisco Guzmán, Preslav Nakov, and Lluís Màrquez. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 887-895, 2016.
PDF BibTex

@inproceedings{guzman-nakov-marquez:2016:SemEval, address = {San Diego, California}, author = {Guzm\'{a}n, Francisco and Nakov, Preslav and M\`{a}rquez, Llu\'{i}s}, booktitle = {Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)}, link = {http://www.aclweb.org/anthology/S16-1137}, month = {June}, pages = {887--895}, publisher = {Association for Computational Linguistics}, title = {MTE-NN at SemEval-2016 Task 3: Can Machine Translation Evaluation Help Community Question Answering?}, year = {2016} }
It Takes Three to Tango: Triangulation Approach to Answer Ranking in Community Question Answering
Preslav Nakov, Lluís Màrquez, and Francisco Guzmán. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1586-1597, 2016.
PDF BibTex

@inproceedings{nakov-marquez-guzman:2016:EMNLP2016, address = {Austin, Texas}, author = {Nakov, Preslav and M\`{a}rquez, Llu\'{i}s and Guzm\'{a}n, Francisco}, booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing}, link = {https://aclweb.org/anthology/D16-1165}, month = {November}, pages = {1586--1597}, publisher = {Association for Computational Linguistics}, title = {It Takes Three to Tango: Triangulation Approach to Answer Ranking in Community Question Answering}, year = {2016} }
Normalizing Mathematical Expressions to Improve the Translation of Educational Content
Wajdi Zaghouani, Ahmed Abdelali, Francisco Guzmán, and Hassan Sajjad. In Proceedings of the Workshop on Semitic Machine Translation, pages 20-27, 2016.
PDF BibTex

@inproceedings{zaghouani-EtAl:2016:SeMaT, address = {Austin, Texas}, author = {Zaghouani, Wajdi and Abdelali, Ahmed and Guzm\'{a}n, Francisco and Sajjad, Hassan}, booktitle = {Proceedings of the Workshop on Semitic Machine Translation}, link = {http://www.aclweb.org/anthology/W/W05/W05-0204}, month = {November}, pages = {20--27}, publisher = {Association for Computational Linguistics}, title = {Normalizing Mathematical Expressions to Improve the Translation of Educational Content}, year = {2016} }
Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings
Francisco Guzmán, Houda Bouamor, Ramy Baly, and Nizar Habash. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics, pages 1398-1408, 2016.
PDF Abstract

Evaluation of machine translation (MT) into morphologically rich languages (MRL) has not been well studied despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically, we report on Arabic, a language with complex and rich morphology. Our results show that using a neural-network model with different input representations produces results that clearly outperform the state-of-the-art for MT evaluation into Arabic, with an increase of almost 75% in correlation with human judgments on the pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations to model sentence similarity for MT evaluation and address complex linguistic phenomena of Arabic.
BibTex

@inproceedings{guzman-EtAl:2016:COLING, abstract = {Evaluation of machine translation (MT) into morphologically rich languages (MRL) has not been well studied despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically we report on Arabic, a language with complex and rich morphology. Our results show that using a neural-network model with different input representations produces results that clearly outperform the state-of-the-art for MT evaluation into Arabic, by almost over 75% increase in correlation with human judgments on pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations to model sentence similarity for MT evaluation and address complex linguistic phenomena of Arabic.}, address = {Osaka, Japan}, author = {Guzm\'{a}n, Francisco and Bouamor, Houda and Baly, Ramy and Habash, Nizar}, booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers}, link = {http://aclweb.org/anthology/C16-1132}, month = {December}, pages = {1398--1408}, publisher = {The COLING 2016 Organizing Committee}, title = {Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings}, year = {2016} }
Machine translation evaluation with neural networks
Francisco Guzmán, Shafiq Joty, Lluís Màrquez, and Preslav Nakov. In Computer Speech & Language, 2016.
PDF Abstract

We present a framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is embedded into compact distributed vector representations, and fed into a multi-layer neural network that models nonlinear interactions between each of the hypotheses and the reference, as well as between the two hypotheses. We experiment with the benchmark datasets from the WMT Metrics shared task, on which we obtain the best results published so far, with the basic network configuration. We also perform a series of experiments to analyze and understand the contribution of the different components of the network. We evaluate variants and extensions, including fine-tuning of the semantic embeddings, and sentence-based representations modeled with convolutional and recurrent neural networks. In summary, the proposed framework is flexible and generalizable, allows for efficient learning and scoring, and provides an MT evaluation metric that correlates with human judgments, and is on par with the state of the art.
BibTex

@article{GuzmanCSL2016, abstract = {We present a framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is embedded into compact distributed vector representations, and fed into a multi-layer neural network that models nonlinear interactions between each of the hypotheses and the reference, as well as between the two hypotheses. We experiment with the benchmark datasets from the {WMT} Metrics shared task, on which we obtain the best results published so far, with the basic network configuration. We also perform a series of experiments to analyze and understand the contribution of the different components of the network. We evaluate variants and extensions, including fine-tuning of the semantic embeddings, and sentence-based representations modeled with convolutional and recurrent neural networks. In summary, the proposed framework is flexible and generalizable, allows for efficient learning and scoring, and provides an {MT} evaluation metric that correlates with human judgments, and is on par with the state of the art.}, author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav}, doi = {http://dx.doi.org/10.1016/j.csl.2016.12.005}, issn = {0885-2308}, journal = {Computer Speech \& Language}, link = {http://www.sciencedirect.com/science/article/pii/S0885230816301693}, title = {Machine translation evaluation with neural networks}, year = {2016} }
2015
Pairwise Neural Machine Translation Evaluation
Francisco Guzmán, Shafiq Joty, Lluís Màrquez, and Preslav Nakov. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference of the Asian Federation of Natural Language Processing, pages 805-814, 2015.
PDF Abstract

We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is compacted into relatively small distributed vector representations, and fed into a multi-layer neural network that models the interaction between each of the hypotheses and the reference, as well as between the two hypotheses. These compact representations are in turn based on word and sentence embeddings, which are learned using neural networks. The framework is flexible, allows for efficient learning and classification, and yields correlation with humans that rivals the state of the art.
BibTex

@InProceedings{guzman2015-ACL, author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav }, title = {Pairwise Neural Machine Translation Evaluation}, booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and The 7th International Joint Conference of the Asian Federation of Natural Language Processing ({ACL}'15)}, month = {July}, year = {2015}, address = {Beijing, China}, publisher = {Association for Computational Linguistics}, pages = {805--814}, url = {http://www.aclweb.org/anthology/P15-1078}, Abstract = {We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is compacted into relatively small distributed vector representations, and fed into a multi-layer neural network that models the interaction between each of the hypotheses and the reference, as well as between the two hypotheses. These compact representations are in turn based on word and sentence embeddings, which are learned using neural networks. The framework is flexible, allows for efficient learning and classification, and yields correlation with humans that rivals the state of the art.} }
Slides
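A minimal illustration of the pairwise setup described in the abstract above, assuming toy random embeddings and an untrained one-hidden-layer network; the paper's actual models use learned lexical, syntactic and semantic embeddings and tuned parameters, so this only shows how the reference and the two hypotheses feed a single network that outputs a preference score.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, HIDDEN = 50, 20   # placeholder sizes, not the paper's settings
_vocab = {}

def embed(sentence):
    """Toy sentence embedding: average of per-word random vectors.
    The paper uses learned word and sentence embeddings plus syntactic
    and semantic representations; this stand-in only fixes the shapes."""
    vecs = []
    for word in sentence.split():
        if word not in _vocab:
            _vocab[word] = rng.standard_normal(DIM)
        vecs.append(_vocab[word])
    return np.mean(vecs, axis=0)

# Untrained placeholder parameters of a one-hidden-layer network.
W = 0.1 * rng.standard_normal((HIDDEN, 3 * DIM))
b = np.zeros(HIDDEN)
v = 0.1 * rng.standard_normal(HIDDEN)

def prefer_first(reference, hyp1, hyp2):
    """Return a score in (0, 1): how strongly the toy model prefers hyp1 over hyp2."""
    x = np.concatenate([embed(reference), embed(hyp1), embed(hyp2)])
    h = np.tanh(W @ x + b)                 # nonlinear interaction layer
    return 1.0 / (1.0 + np.exp(-(v @ h)))  # sigmoid preference score

print(prefer_first("the cat sat on the mat",
                   "a cat sat on the mat",
                   "mat the cat a on sat"))
```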
The QCN Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT’2015
Hassan Sajjad, Nadir Durrani, Francisco Guzmán, Preslav Nakov, Ahmed Abdelali, Stephan Vogel, Wael Salloum, Ahmed El Kholy, and Nizar Habash. In Proceedings of the NIST Open Machine Translation Evaluation Workshop, 2015.
PDF Abstract

The paper describes the Egyptian Arabic-to-English statistical machine translation (SMT) system that the QCRI-Columbia-NYUAD (QCN) group submitted to the NIST OpenMT'2015 competition. The competition focused on informal dialectal Arabic, as used in SMS, chat, and speech. Thus, our efforts focused on processing and standardizing Arabic, e.g., using tools such as 3arrib and MADAMIRA. We further trained a phrase-based SMT system using state-of-the-art features and components such as operation sequence model, class-based language model, sparse features, neural network joint model, genre-based hierarchically-interpolated language model, unsupervised transliteration mining, phrase-table merging, and hypothesis combination. Our system ranked second on all three genres.
BibTex

@inproceedings{sajjad2015-NIST, title={The QCN Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT’2015}, author={Sajjad, Hassan and Durrani, Nadir and Guzmán, Francisco and Nakov, Preslav and Abdelali, Ahmed and Vogel, Stephan and Salloum, Wael and El Kholy, Ahmed and Habash, Nizar}, year={2015}, abstract={The paper describes the Egyptian Arabic-to-English statistical machine translation (SMT) system that the QCRI-Columbia-NYUAD (QCN) group submitted to the NIST OpenMT'2015 competition. The competition focused on informal dialectal Arabic, as used in SMS, chat, and speech. Thus, our efforts focused on processing and standardizing Arabic, e.g., using tools such as 3arrib and MADAMIRA. We further trained a phrase-based SMT system using state-of-the-art features and components such as operation sequence model, class-based language model, sparse features, neural network joint model, genre-based hierarchically-interpolated language model, unsupervised transliteration mining, phrase-table merging, and hypothesis combination. Our system ranked second on all three genres.}, booktitle={Proceedings of the NIST Open Machine Translation Evaluation Workshop}, publisher={NIST} }
Slides
QAT2 – The QCRI Advanced Transcription and Translation System
Ahmed Abdelali, Ahmed Ali, Francisco Guzmán, Felix Stahlberg, Stephan Vogel, and Yifan Zhang. In Proceedings of INTERSPEECH, 2015.
PDF Abstract

QAT2 is a multimedia content translation web service developed by QCRI to help content providers reach audiences and viewers speaking different languages. It is built with open source technologies such as KALDI, Moses and MaryTTS, to provide a complete translation experience for web users. It translates text content in its original format, and produces translated videos with speech-to-speech translation. The result is a complete native language experience for end users on foreign language websites. The system currently supports translation from Arabic to English.
BibTex

@inproceedings{abdelali2015-INTERSPEECH, title={{QAT}$^2$--The {QCRI} Advanced Transcription and Translation System}, author={Abdelali, Ahmed and Ali, Ahmed and Guzm{\'a}n, Francisco and Stahlberg, Felix and Vogel, Stephan and Zhang, Yifan}, booktitle={ Proceedings of the 16th Annual Conference of the International Speech Communication Association}, year={2015}, abstract={QAT2 is a multimedia content translation web service developed by QCRI to help content provider to reach audiences and viewers speaking different languages. It is built with open source technologies such as KALDI, Moses and MaryTTS, to provide a complete translation experience for web users. It translates text content in its original format, and pro- duce translated videos with speech-to-speech translation. The result is a complete native language experience for end users on foreign language websites. The system currently supports translation from Arabic to English.} }
Software/Code
Analyzing Optimization for Statistical Machine Translation: MERT Learns Verbosity, PRO Learns Length
Francisco Guzmán, Preslav Nakov, and Stephan Vogel. In Proceedings of the Nineteenth Conference on Computational Natural Language Learning (CoNLL), pages 62-72, 2015.
PDF Abstract

We study the impact of source length and verbosity of the tuning dataset on the performance of parameter optimizers such as MERT and PRO for statistical machine translation. In particular, we test whether the verbosity of the resulting translations can be modified by varying the length or the verbosity of the tuning sentences. We find that MERT learns the tuning set verbosity very well, while PRO is sensitive to both the verbosity and the length of the source sentences in the tuning set; yet, overall PRO learns best from high-verbosity tuning datasets. Given these dependencies, and potentially some others such as the amount of reordering, the number of unknown words, syntactic complexity, and the evaluation measure, to mention just a few, we argue for the need for controlled evaluation scenarios, so that the selection of tuning set and optimization strategy does not overshadow scientific advances in modeling or decoding. In the meantime, until we develop such controlled scenarios, we recommend using PRO with a large verbosity tuning set, which, in our experiments, yields the highest BLEU across datasets and language pairs.
BibTex

@InProceedings{guzman-nakov-vogel:2015:CoNLL, author = {Guzm\'{a}n, Francisco and Nakov, Preslav and Vogel, Stephan}, title = {Analyzing Optimization for Statistical Machine Translation: MERT Learns Verbosity, PRO Learns Length}, booktitle = {Proceedings of the Nineteenth Conference on Computational Natural Language Learning}, month = {July}, year = {2015}, address = {Beijing, China}, publisher = {Association for Computational Linguistics}, pages = {62--72}, url = {http://www.aclweb.org/anthology/K15-1007}, abstract = {We study the impact of source length and verbosity of the tuning dataset on the performance of parameter optimizers such as MERT and PRO for statistical machine translation. In particular, we test whether the verbosity of the resulting translations can be modified by varying the length or the verbosity of the tuning sentences. We find that MERT learns the tuning set verbosity very well, while PRO is sensitive to both the verbosity and the length of the source sentences in the tuning set; yet, overall PRO learns best from high-verbosity tuning datasets. Given these dependencies, and potentially some other such as amount of reordering, number of unknown words, syntactic complexity, and evaluation measure, to mention just a few, we argue for the need of controlled evaluation scenarios, so that the selection of tuning set and optimization strategy does not overshadow scientific advances in modeling or decoding. In the mean time, until we develop such controlled scenarios, we recommend using PRO with a large verbosity tuning set, which, in our experiments, yields highest BLEU across datasets and language pairs.} }
Slides
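As a point of reference for the terminology in the abstract above, verbosity is commonly measured as a target-to-source length ratio over a dataset. The formula below follows that common convention and may differ in detail from the paper's exact definition.

```latex
% Verbosity of a set of source sentences s_i with translations t_i:
% average number of output words produced per input word.
\mathrm{verbosity}\bigl(\{(s_i, t_i)\}_{i=1}^{N}\bigr)
  \;=\; \frac{\sum_{i=1}^{N} |t_i|}{\sum_{i=1}^{N} |s_i|}
```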
How do Humans Evaluate Machine Translation
Francisco Guzmán, Ahmed Abdelali, Irina Temnikova, Hassan Sajjad, and Stephan Vogel. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 457-466, 2015.
PDF Abstract

In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.
BibTex

@inproceedings{guzman-EtAl:2015:WMT, abstract = {In this paper, we take a closer look at the MT evaluation process from a glass-box perspective using eye-tracking. We analyze two aspects of the evaluation task: the background of evaluators (monolingual or bilingual) and the sources of information available, and we evaluate them using time and consistency as criteria. Our findings show that monolinguals are slower but more consistent than bilinguals, especially when only target language information is available. When exposed to various sources of information, evaluators in general take more time and in the case of monolinguals, there is a drop in consistency. Our findings suggest that to have consistent and cost effective MT evaluations, it is better to use monolinguals with only target language information.}, address = {Lisbon, Portugal}, author = {Guzm\'{a}n, Francisco and Abdelali, Ahmed and Temnikova, Irina and Sajjad, Hassan and Vogel, Stephan}, booktitle = {Proceedings of the Tenth Workshop on Statistical Machine Translation}, link = {http://aclweb.org/anthology/W15-3059}, month = {September}, pages = {457--466}, publisher = {Association for Computational Linguistics}, title = {How do Humans Evaluate Machine Translation}, year = {2015} }

Errata: A few typos were corrected in this version.
Errata   Slides Software/Code
2014
The AMARA Corpus: Building parallel language resources for the educational domain
Ahmed Abdelali, Francisco Guzmán, Hassan Sajjad, and Stephan Vogel. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), 2014.
PDF Abstract

This paper presents the AMARA corpus of on-line educational content: a new parallel corpus of educational video subtitles, multilingually aligned for 20 languages, i.e. 20 monolingual corpora and 190 parallel corpora. This corpus includes both resource-rich languages such as English and Arabic, and resource-poor languages such as Hindi and Thai. In this paper, we describe the gathering, validation, and preprocessing of a large collection of parallel, community-generated subtitles. Furthermore, we describe the methodology used to prepare the data for Machine Translation tasks. Additionally, we provide a document-level, jointly aligned development and test sets for 14 language pairs, designed for tuning and testing Machine Translation systems. We provide baseline results for these tasks, and highlight some of the challenges we face when building machine translation systems for educational content.
BibTex

@inproceedings{abdelali:2014:amara, abstract = {This paper presents the AMARA corpus of on-line educational content: a new parallel corpus of educational video subtitles, multilingually aligned for 20 languages, i.e. 20 monolingual corpora and 190 parallel corpora. This corpus includes both resource-rich languages such as English and Arabic, and resource-poor languages such as Hindi and Thai. In this paper, we describe the gathering, validation, and preprocessing of a large collection of parallel, community-generated subtitles. Furthermore, we describe the methodology used to prepare the data for Machine Translation tasks. Additionally, we provide a document-level, jointly aligned development and test sets for 14 language pairs, designed for tuning and testing Machine Translation systems. We provide baseline results for these tasks, and highlight some of the challenges we face when building machine translation systems for educational content.}, address = {Reykjavik, Iceland}, author = {Abdelali, Ahmed and Guzm{\'a}n, Francisco and Sajjad, Hassan and Vogel, Stephan}, booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation ({LREC}'14)}, month = {april}, title = {The {AMARA} Corpus: Building parallel language resources for the educational domain}, year = {2014} }
Slides
Amara: A Sustainable, Global Solution for Accessibility, Powered by Communities of Volunteers
Dean Jansen, Aleli Alcala, and Francisco Guzmán. In Universal Access in Human-Computer Interaction. Design for All and Accessibility Practice, pages 401-411, 2014.
PDF Abstract

In this paper, we present the main features of the Amara project, and its impact on the accessibility landscape with the use of innovative technology. We also show the effectiveness of volunteer communities in addressing large subtitling and translation tasks, that accompany the ever-growing amounts of online video content. Furthermore, we present two different applications for the platform. First, we examine the growing interest of organizations to build their own subtitling communities. Second, we present how the community-generated material can be used to advance the state-of-the-art of research in fields such as Statistical Machine Translation with focus on educational translation. We provide examples on how both tasks can be achieved successfully.
BibTex

@inproceedings{jansen:2014:amara, abstract = {In this paper, we present the main features of the Amara project, and its impact on the accessibility landscape with the use of innovative technology. We also show the effectiveness of volunteer communities in addressing large subtitling and translation tasks, that accompany the ever-growing amounts of online video content. Furthermore, we present two different applications for the platform. First, we examine the growing interest of organizations to build their own subtitling communities. Second, we present how the community-generated material can be used to advance the state-of-the-art of research in fields such as Statistical Machine Translation with focus on educational translation. We provide examples on how both tasks can be achieved successfully.}, address = {Heraklion, Greece}, author = {Jansen, Dean and Alcala, Aleli and Guzm{\'a}n, Francisco}, booktitle = {Universal Access in Human-Computer Interaction. Design for All and Accessibility Practice}, month = {June}, pages = {401--411}, publisher = {Springer International Publishing}, title = {Amara: A Sustainable, Global Solution for Accessibility, Powered by Communities of Volunteers}, year = {2014} }
Using Discourse Structure Improves Machine Translation Evaluation
Francisco Guzmán, Shafiq Joty, Lluís Màrquez, and Preslav Nakov. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL'14), pages 687-698, 2014.
PDF Abstract

We present experiments in using discourse structure for improving machine translation evaluation. We first design two discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory. Then, we show that these measures can help improve a number of existing machine translation evaluation metrics both at the segment- and at the system-level. Rather than proposing a single new metric, we show that discourse information is complementary to the state-of-the-art evaluation metrics, and thus should be taken into account in the development of future richer evaluation metrics.
BibTex

@inproceedings{guzman-EtAl:2014:P14-1, abstract = {We present experiments in using discourse structure for improving machine translation evaluation. We first design two discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory. Then, we show that these measures can help improve a number of existing machine translation evaluation metrics both at the segmentand at the system-level. Rather than proposing a single new metric, we show that discourse information is complementary to the state-of-the-art evaluation metrics, and thus should be taken into account in the development of future richer evaluation metrics.}, address = {Baltimore, Maryland, USA}, author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav}, booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics ({ACL}'14)}, link = {http://www.aclweb.org/anthology/P/P14/P14-1065}, month = {June}, pages = {687--698}, publisher = {Association for Computational Linguistics}, title = {Using Discourse Structure Improves Machine Translation Evaluation}, year = {2014} }
Slides
DiscoTK: Using Discourse Structure for Machine Translation Evaluation
Shafiq Joty, Francisco Guzmán, Lluís Màrquez, and Preslav Nakov. In Proceedings of the Ninth Workshop on Statistical Machine Translation (WMT'14), pages 402-408, 2014.
PDF Abstract

We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference. We experiment with five transformations and augmentations of a base discourse tree representation based on the rhetorical structure theory, and we combine the kernel scores for each of them into a single score. Finally, we add other metrics from the ASIYA MT evaluation toolkit, and we tune the weights of the combination on actual human judgments. Experiments on the WMT12 and WMT13 metrics shared task datasets show correlation with human judgments that outperforms what the best systems that participated in these years achieved, both at the segment and at the system level.
BibTex

@inproceedings{joty-EtAl:2014:W14-33, abstract = {We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference. We experiment with five transformations and augmentations of a base discourse tree representation based on the rhetorical structure theory, and we combine the kernel scores for each of them into a single score. Finally, we add other metrics from the ASIYA MT evaluation toolkit, and we tune the weights of the combination on actual human judgments. Experiments on the WMT12 and WMT13 metrics shared task datasets show correlation with human judgments that outperforms what the best systems that participated in these years achieved, both at the segment and at the system level.}, address = {Baltimore, Maryland, USA}, author = {Joty, Shafiq and Guzm\'{a}n, Francisco and M\`{a}rquez, Llu\'{i}s and Nakov, Preslav}, booktitle = {Proceedings of the Ninth Workshop on Statistical Machine Translation ({WMT}'14)}, link = {http://www.aclweb.org/anthology/W/W14/W14-3352}, month = {June}, pages = {402--408}, publisher = {Association for Computational Linguistics}, title = {DiscoTK: Using Discourse Structure for Machine Translation Evaluation}, year = {2014} }
Slides
Learning to Differentiate Better from Worse Translations
Francisco Guzmán, Shafiq Joty, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov, and Massimo Nicosia. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 214-220, 2014.
PDF Abstract

We present a pairwise learning-to-rank approach to machine translation evaluation that learns to differentiate better from worse translations in the context of a given reference. We integrate several layers of linguistic information encapsulated in tree-based structures, making use of both the reference and the system output simultaneously, thus bringing our ranking closer to how humans evaluate translations. Most importantly, instead of deciding upfront which types of features are important, we use the learning framework of preference re-ranking kernels to learn the features automatically. The evaluation results show that learning in the proposed framework yields better correlation with humans than computing the direct similarity over the same type of structures. Also, we show our structural kernel learning (SKL) can be a general framework for MT evaluation, in which syntactic and semantic information can be naturally incorporated.
BibTex

@inproceedings{guzman-EtAl:2014:EMNLP2014, abstract = {We present a pairwise learning-to-rank approach to machine translation evaluation that learns to differentiate better from worse translations in the context of a given reference. We integrate several layers of linguistic information encapsulated in tree-based structures, making use of both the reference and the system output simultaneously, thus bringing our ranking closer to how humans evaluate translations. Most importantly, instead of deciding upfront which types of features are important, we use the learning framework of preference re-ranking kernels to learn the features automatically. The evaluation results show that learning in the proposed framework yields better correlation with humans than computing the direct similarity over the same type of structures. Also, we show our structural kernel learning (SKL) can be a general framework for MT evaluation, in which syntactic and semantic information can be naturally incorporated.}, address = {Doha, Qatar}, author = {Guzm\'{a}n, Francisco and Joty, Shafiq and M\`{a}rquez, Llu\'{i}s and Moschitti, Alessandro and Nakov, Preslav and Nicosia, Massimo}, booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP'14)}, link = {http://www.aclweb.org/anthology/D14-1027}, month = {October}, pages = {214--220}, publisher = {Association for Computational Linguistics}, title = {Learning to Differentiate Better from Worse Translations}, year = {2014} }
Slides
2013
A Tale about PRO and Monsters
Preslav Nakov, Francisco Guzmán, and Stephan Vogel. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL'13), pages 12-17, 2013.
PDF Abstract

While experimenting with tuning on long sentences, we made an unexpected discovery: that PRO falls victim to monsters, overly long negative examples with very low BLEU+1 scores, which are unsuitable for learning and can cause testing BLEU to drop by several points absolute. We propose several effective ways to address the problem, using length- and BLEU+1-based cut-offs, outlier filters, stochastic sampling, and random acceptance. The best of these fixes not only slay and protect against monsters, but also yield higher stability for PRO as well as improved test-time BLEU scores. Thus, we recommend them to anybody using PRO, monster-believer or not.
BibTex

@inproceedings{nakov-guzman-vogel:2013:Short, abstract = {While experimenting with tuning on long sentences, we made an unexpected discovery: that PRO falls victim to monsters -overly long negative examples with very low BLEU+1 scores, which are unsuitable for learning and can cause testing BLEU to drop by several points absolute. We propose several effective ways to address the problem, using length- and BLEU+1- based cut-offs, outlier filters, stochastic sampling, and random acceptance. The best of these fixes not only slay and protect against monsters, but also yield higher stability for PRO as well as improved test-time BLEU scores. Thus, we recommend them to anybody using PRO, monster-believer or not.}, address = {Sofia, Bulgaria}, author = {Nakov, Preslav and Guzm{\'a}n, Francisco and Vogel, Stephan}, booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics ({ACL'13})}, link = {http://www.aclweb.org/anthology/P13-2003}, month = {August}, pages = {12--17}, title = {A Tale about {PRO} and Monsters}, year = {2013} }
Slides
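A sketch of the kind of pre-filtering the abstract above alludes to, with invented function and parameter names and illustrative thresholds; the paper's actual cut-offs and its additional fixes (stochastic sampling, random acceptance) are not reproduced here.

```python
def filter_monsters(candidates, ref_len, max_len_ratio=2.0, min_bleu1=0.05):
    """Drop 'monster' hypotheses before PRO samples training pairs.

    candidates    : list of (hypothesis_tokens, sentence_bleu_plus1) tuples
    ref_len       : length of the reference translation, in tokens
    max_len_ratio : discard hypotheses longer than this multiple of the reference
    min_bleu1     : discard hypotheses whose sentence-level BLEU+1 is below this

    Thresholds are illustrative only, not the values used in the paper.
    """
    kept = []
    for tokens, bleu1 in candidates:
        too_long = len(tokens) > max_len_ratio * ref_len
        too_bad = bleu1 < min_bleu1
        if not (too_long or too_bad):
            kept.append((tokens, bleu1))
    return kept

# Toy usage: the second candidate is a "monster" (very long, near-zero BLEU+1).
cands = [("the cat sat on the mat".split(), 0.62),
         (("blah " * 40).split(), 0.01)]
print(len(filter_monsters(cands, ref_len=6)))  # -> 1
```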
Parameter Optimization for Statistical Machine Translation: It Pays to Learn from Hard Examples
Preslav Nakov, Fahad Al Obaidli, Francisco Guzmán, and Stephan Vogel. In Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP'13), 2013.
PDF Abstract

Research on statistical machine translation has focused on particular translation directions, typically with English as the target language, e.g., from Arabic to English. When we reverse the translation direction, the multiple reference translations turn into multiple possible inputs, which offers both challenges and opportunities. We propose and evaluate several strategies for making use of these multiple inputs: (a) select one of the datasets, (b) select the best input for each sentence, and (c) synthesize an input for each sentence by fusing the available inputs. Surprisingly, we find out that it is best to tune on the hardest available input, not on the one that yields the highest BLEU score. This finding has implications on how to pick good translators and how to select useful data for parameter optimization in SMT.
BibTex

@inproceedings{nakov:2013:parameter, abstract = {Research on statistical machine translation has focused on particular translation directions, typically with English as the target language, e.g., from Arabic to English. When we reverse the translation direction, the multiple reference translations turn into multiple possible inputs, which offers both challenges and opportunities. We propose and evaluate several strategies for making use of these multiple inputs: (a) select one of the datasets, (b) select the best input for each sentence, and (c) synthesize an input for each sentence by fusing the available inputs. Surprisingly, we find out that it is best to tune on the hardest available input, not on the one that yields the highest BLEU score. This finding has implications on how to pick good translators and how to select useful data for parameter optimization in SMT.}, author = {Nakov, Preslav and Al Obaidli, Fahad and Guzm{\'a}n, Francisco and Vogel, Stephan}, booktitle = {Proceedings of the International Conference Recent Advances in Natural Language Processing ({RANLP}'13)}, month = {September}, title = {Parameter Optimization for Statistical Machine Translation: It Pays to Learn from Hard Examples}, year = {2013} }
Slides
QCRI at IWSLT 2013: Experiments in Arabic-English and English-Arabic Spoken Language Translation
Hassan Sajjad, Francisco Guzmán, Preslav Nakov, Ahmed Abdelali, Kenton Murray, Fahad Al Obaidli, and Stephan Vogel. In Proceedings of the 10th International Workshop on Spoken Language Translation (IWSLT'13), 2013.
PDF Abstract

We describe the Arabic-English and English-Arabic statistical machine translation systems developed by the Qatar Computing Research Institute for the IWSLT’2013 evaluation campaign on spoken language translation. We used one phrase-based and two hierarchical decoders, exploring various settings thereof. We further experimented with three domain adaptation methods, and with various Arabic word segmentation schemes. Combining the output of several systems yielded a gain of up to 3.4 BLEU points over the baseline. Here we also describe a specialized normalization scheme for evaluating Arabic output, which was adopted for the IWSLT’2013 evaluation campaign.
BibTex

@inproceedings{sajjad:2013:qcri, abstract = {We describe the Arabic-English and English-Arabic statistical machine translation systems developed by the Qatar Computing Research Institute for the IWSLT’2013 evaluation campaign on spoken language translation. We used one phrase-based and two hierarchical decoders, exploring various settings thereof. We further experimented with three domain adaptation methods, and with various Arabic word segmentation schemes. Combining the output of several systems yielded a gain of up to 3.4 BLEU points over the baseline. Here we also describe a specialized normalization scheme for evaluating Arabic output, which was adopted for the IWSLT’2013 evaluation campaign.}, address = {Heidelberg, Germany}, author = {Sajjad, Hassan and Guzm{\'a}n, Francisco and Nakov, Preslav and Abdelali, Ahmed and Murray, Kenton and Al Obaidli, Fahad and Vogel, Stephan}, booktitle = {Proceedings of the 10th International Workshop on Spoken Language Translation {(IWSLT'13)}}, month = {December}, title = {{QCRI} at {IWSLT} 2013: Experiments in Arabic-English and English-Arabic Spoken Language Translation}, volume = {13}, year = {2013} }
Slides
The AMARA Corpus: Building Resources for Translating the Web's Educational Content
Francisco Guzmán, Hassan Sajjad, Ahmed Abdelali, and Stephan Vogel. In Proceedings of the 10th International Workshop on Spoken Language Translation (IWSLT'13), 2013.
PDF Abstract

In this paper, we introduce a new parallel corpus of subtitles of educational videos: the AMARA corpus for online educational content. We crawl a multilingual collection of community-generated subtitles, and present the results of processing the Arabic–English portion of the data, which yields a parallel corpus of about 2.6M Arabic and 3.9M English words. We explore different approaches to align the segments, and extrinsically evaluate the resulting parallel corpus on the standard TED-talks tst-2010. We observe that the data can be successfully used for this task, and also observe an absolute improvement of 1.6 BLEU when it is used in combination with TED data. Finally, we analyze some of the specific challenges when translating the educational content.
BibTex

@inproceedings{guzman:2013:amara, abstract = {In this paper, we introduce a new parallel corpus of subtitles of educational videos: the AMARA corpus for online educational content. We crawl a multilingual collection community generated subtitles, and present the results of processing the Arabic–English portion of the data, which yields a parallel corpus of about 2.6M Arabic and 3.9M English words. We explore different approaches to align the segments, and extrinsically evaluate the resulting parallel corpus on the standard TED-talks tst-2010. We observe that the data can be successfully used for this task, and also observe an absolute improvement of 1.6 BLEU when it is used in combination with TED data. Finally, we analyze some of the specific challenges when translating the educational content.}, address = {Heidelberg, Germany}, author = {Guzm{\'a}n, Francisco and Sajjad, Hassan and Abdelali, Ahmed and Vogel, Stephan}, booktitle = {Proceedings of the 10th International Workshop on Spoken Language Translation {(IWSLT'13})}, month = {December}, title = {The {AMARA} Corpus: Building Resources for Translating the Web's Educational Content}, volume = {13}, year = {2013} }
Slides
2012
QCRI at WMT12: Experiments in Spanish-English and German-English Machine Translation of News Text
Francisco Guzmán, Preslav Nakov, Ahmed Thabet, and Stephan Vogel. In Proceedings of the Seventh Workshop on Statistical Machine Translation (WMT'12), pages 298-303, 2012.
PDF Abstract

We describe the systems developed by the team of the Qatar Computing Research Institute for the WMT12 Shared Translation Task. We used a phrase-based statistical machine translation model with several non-standard settings, most notably tuning data selection and phrase table combination. The evaluation results show that we rank second in BLEU and TER for Spanish-English, and in the top tier for German-English.
BibTex

@InProceedings{guzman-EtAl:2012:WMT, author = {Guzm{\'a}n, Francisco and Nakov, Preslav and Thabet, Ahmed and Vogel, Stephan}, title = {{QCRI} at {WMT}12: Experiments in Spanish-English and German-English Machine Translation of News Text}, booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation ({WMT'12})}, month = {June}, year = {2012}, address = {Montr{\'e}al, Canada}, publisher = {Association for Computational Linguistics}, pages = {298--303}, url = {http://www.aclweb.org/anthology/W12-3136}, abstract = {We describe the systems developed by the team of the Qatar Computing Research Institute for the WMT12 Shared Translation Task. We used a phrase-based statistical machine translation model with several non-standard settings, most notably tuning data selection and phrase table combination. The evaluation results show that we rank second in BLEU and TER for Spanish-English, and in the top tier for German-English.} }
Slides
Optimizing for Sentence-Level BLEU+1 Yields Short Translations
Preslav Nakov, Francisco Guzmán, and Stephan Vogel. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), pages 1979-1994, 2012.
PDF Abstract

We study a problem with pairwise ranking optimization (PRO): that it tends to yield too short translations. We find that this is partially due to the inadequate smoothing in PRO’s BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balance between the two, compared to BLEU. It is also partially due to PRO optimizing for a sentence-level score without a global view on the overall length, which introduces a bias towards short translations; we show that letting PRO optimize a corpus-level BLEU yields a perfect length. Finally, we find some residual bias due to the interaction of PRO with BLEU+1: such a bias does not exist for a version of MIRA with sentence-level BLEU+1. We propose several ways to fix the length problem of PRO, including smoothing the brevity penalty, scaling the effective reference length, grounding the precision component, and unclipping the brevity penalty, which yield sizable improvements in test BLEU on two Arabic-English datasets: IWSLT (+0.65) and NIST (+0.37).
BibTex

@inproceedings{nakov-guzman-vogel:2012:PAPERS, author = {Nakov, Preslav and Guzm{\'a}n, Francisco and Vogel, Stephan}, title = {Optimizing for Sentence-Level {BLEU}+1 Yields Short Translations}, booktitle = {Proceedings of the 24rd International Conference on Computational Linguistics (COLING) 2012 }, month = {December}, year = {2012}, address = {Mumbai, India}, publisher = {The {COLING} 2012 Organizing Committee}, pages = {1979--1994}, url = {http://www.aclweb.org/anthology/C12-1121}, abstract = {We study a problem with pairwise ranking optimization (PRO): that it tends to yield too short translations. We find that this is partially due to the inadequate smoothing in PRO’s BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balance between the two, compared to BLEU. It is also partially due to PRO optimizing for a sentence-level score without a global view on the overall length, which introducing a bias towards short translations; we show that letting PRO optimize a corpus-level BLEU yields a perfect length. Finally, we find some residual bias due to the interaction of PRO with BLEU+1: such a bias does not exist for a version of MIRA with sentence-level BLEU+1. We propose several ways to fix the length problem of PRO, including smoothing the brevity penalty, scaling the effective reference length, grounding the precision component, and unclipping the brevity penalty, which yield sizable improvements in test BLEU on two Arabic-English datasets: IWSLT (+0.65) and NIST (+0.37).} }
Slides
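For context on the smoothing issue raised in the abstract above, the commonly used sentence-level BLEU+1 (following Lin and Och's add-one smoothing) can be written as below; the exact variant analyzed in the paper may differ in detail. The n-gram precisions are smoothed, but the brevity penalty keeps its corpus-level form, which is the asymmetry the paper examines.

```latex
% Sentence-level BLEU+1 for a hypothesis of length c against a reference of length r;
% m_n = matched n-grams, l_n = total hypothesis n-grams (add-one smoothing for n >= 2).
\mathrm{BLEU{+}1}
  = \min\!\bigl(1,\, e^{\,1 - r/c}\bigr)\,
    \exp\!\left(\frac{1}{4}\left[\log\frac{m_1}{l_1}
      + \sum_{n=2}^{4}\log\frac{m_n + 1}{l_n + 1}\right]\right)
```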
Understanding the Performance of Statistical MT Systems: A Linear Regression Framework
Francisco Guzmán and Stephan Vogel. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), pages 1029-1044, 2012.
PDF Abstract

We present a framework for the analysis of Machine Translation performance. We use multivariate linear models to determine the impact of a wide range of features on translation performance. Our assumption is that variables that most contribute to predict translation performance are the key to understand the differences between good and bad translations. During training, we learn the regression parameters that better predict translation quality using a wide range of input features based on the translation model and the first-best translation hypotheses. We use a linear regression with regularization. Our results indicate that with regularized linear regression, we can achieve higher levels of correlation between our predicted values and the actual values of the quality metrics. Our analysis shows that the performance for in-domain data is largely dependent on the characteristics of the translation model. On the other hand, out-of-domain data can benefit from better reordering strategies.
BibTex

@InProceedings{guzman-vogel:2012:PAPERS, author = {Guzm{\'a}n, Francisco and Vogel, Stephan}, title = {Understanding the Performance of Statistical {MT} Systems: A Linear Regression Framework}, booktitle = {Proceedings of the 24th International Conference on Computational Linguistics ({COLING} 2012)}, month = {December}, year = {2012}, address = {Mumbai, India}, pages = {1029--1044}, url = {http://www.aclweb.org/anthology/C12-1063}, abstract = {We present a framework for the analysis of Machine Translation performance. We use multivariate linear models to determine the impact of a wide range of features on translation performance. Our assumption is that variables that most contribute to predict translation performance are the key to understand the differences between good and bad translations. During training, we learn the regression parameters that better predict translation quality using a wide range of input features based on the translation model and the first-best translation hypotheses. We use a linear regression with regularization. Our results indicate that with regularized linear regression, we can achieve higher levels of correlation between our predicted values and the actual values of the quality metrics. Our analysis shows that the performance for in-domain data is largely dependent on the characteristics of the translation model. On the other hand, out-of-domain data can benefit from better reordering strategies.} }
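A hedged sketch of the regularized linear regression setup the abstract above describes, using scikit-learn's Ridge estimator; the feature names and synthetic data are placeholders, not the features or measurements from the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder design matrix: one row per translated document/system, columns are
# hypothetical translation-model and hypothesis features (e.g., phrase-table
# coverage, average phrase length, amount of reordering, OOV rate).
feature_names = ["tm_coverage", "avg_phrase_len", "reordering", "oov_rate"]
X = rng.normal(size=(200, len(feature_names)))
true_w = np.array([0.8, 0.3, -0.2, -0.6])
y = X @ true_w + rng.normal(scale=0.1, size=200)  # synthetic "BLEU" target

model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)

# The learned coefficients indicate which features most influence the predicted
# translation quality -- the kind of analysis the paper performs.
for name, coef in zip(feature_names, model.named_steps["ridge"].coef_):
    print(f"{name:>15}: {coef:+.3f}")
```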
2011
Word Alignment Revisited
Francisco Guzmán, Qin Gao, Jan Niehues, and Stephan Vogel. In Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation, pages 164-175. Joseph Olive, Caitlin Christianson, and John McCary (Eds.). Springer Science & Business Media, 2011.
PDF BibTex

@incollection{francisco:2011:word, Author = {Francisco Guzm{\'a}n and Qin Gao and Jan Niehues and Stephan Vogel}, Booktitle = {Handbook of Natural Language Processing and Machine Translation: DARPA global autonomous language exploitation.}, Pages = {164--175}, editor = {Olive, Joseph and Christianson, Caitlin and McCary, John}, publisher = {Springer Science \& Business Media}, Title = {Word Alignment Revisited}, Year = {2011}}
The Impact of Statistical Word Alignment Quality and Structure in Phrase Based Statistical Machine Translation
Francisco Guzmán. PhD thesis, Instituto Tecnológico y de Estudios Superiores de Monterrey, 2011.
PDF Abstract

Statistical Word Alignments represent lexical word-to-word translations between source and target language sentences. They are considered the starting point for many state-of-the-art Statistical Machine Translation (SMT) systems. In phrase-based systems, word alignments are loosely linked to the translation model. Despite the improvements reached in word alignment quality, there has been a modest improvement in the end-to-end translation. Until recently, little or no attention was paid to the structural characteristics of word alignments (e.g. unaligned words) and their impact in further stages of the phrase-based SMT pipeline. A better understanding of the relationship between word alignment and the entailing processes will help to identify the variables across the pipeline that most influence translation performance and can be controlled by modifying word alignment characteristics. In this dissertation, we perform an in-depth study of the impact of word alignments at different stages of the phrase-based statistical machine translation pipeline, namely word alignment, phrase extraction, phrase scoring and decoding. Moreover, we establish a multivariate prediction model for different variables of word alignments, phrase tables and translation hypotheses. Based on those models, we identify the most important alignment variables and propose two alternatives to provide more control over alignment structure and thus improve SMT. Our results show that incorporating alignment structure into decoding, via alignment gap features, yields significant improvements, especially in situations where translation data is limited. During the development of this dissertation, we discovered how different characteristics of the alignment impact Machine Translation. We observed that while good quality alignments yield good phrase-pairs, the consolidation of a translation model is dependent on the alignment structure, not quality. Human alignments are denser than their computer-generated counterparts, which tend to be sparser and precision-oriented. Trying to emulate human-like alignment structure resulted in poorer systems, because the resulting translation models tend to be more compact and lack translation options. On the other hand, more translation options, even if they are noisier, help to improve the quality of the translation. This is due to the fact that translation does not rely only on the translation model, but also on other factors that help to discriminate the noise from bad translations (e.g. the language model). Lastly, when we provide the decoder with features that help it to make more "informed decisions", we observe a clear improvement in translation quality. This was especially true for the discriminative alignments, which inherently leave more unaligned words. The result is more evident in low-resource settings, where larger translation lexicons represent more translation options. Using simple features to help the decoder discriminate translation hypotheses clearly showed consistent improvements.
BibTex

The Impact of Statistical Word Alignment Quality and Structure in Phrase Based Statistical Machine Translation

@phdthesis{guzman-thesis:2011, abstract = {Statistical word alignments represent lexical word-to-word translations between source and target language sentences. They are considered the starting point for many state-of-the-art Statistical Machine Translation (SMT) systems. In phrase-based systems, word alignments are loosely linked to the translation model. Despite the improvements achieved in word alignment quality, there has been only a modest improvement in end-to-end translation. Until recently, little or no attention was paid to the structural characteristics of word alignments (e.g. unaligned words) and their impact on later stages of the phrase-based SMT pipeline. A better understanding of the relationship between word alignment and the processes that follow it helps to identify the variables across the pipeline that most influence translation performance and that can be controlled by modifying the alignment's characteristics. In this dissertation, we perform an in-depth study of the impact of word alignments at different stages of the phrase-based statistical machine translation pipeline, namely word alignment, phrase extraction, phrase scoring and decoding. Moreover, we establish multivariate prediction models for different variables of word alignments, phrase tables and translation hypotheses. Based on those models, we identify the most important alignment variables and propose two alternatives to provide more control over alignment structure and thus improve SMT. Our results show that bringing alignment structure into decoding, via alignment gap features, yields significant improvements, especially in situations where translation data is limited. During the development of this dissertation we discovered how different characteristics of the alignment impact machine translation. We observed that while good-quality alignments yield good phrase pairs, the consolidation of a translation model depends on the alignment structure, not its quality. Human alignments are denser than their computer-generated counterparts, which tend to be sparser and precision-oriented. Trying to emulate human-like alignment structure resulted in poorer systems, because the resulting translation models tend to be more compact and lack translation options. On the other hand, more translation options, even noisier ones, help to improve translation quality. This is because translation does not rely only on the translation model, but also on other factors that help to discriminate good translations from bad ones (e.g. the language model). Lastly, when we provide the decoder with features that help it make more informed decisions, we observe a clear improvement in translation quality. This was especially true for the discriminative alignments, which inherently leave more unaligned words. The effect is most evident in low-resource settings, where larger translation lexicons provide more translation options. Using simple features to help the decoder discriminate among translation hypotheses showed consistent improvements.}, author = {Guzm{\'a}n, Francisco}, month = {December}, school = {Instituto Tecnol{\'o}gico y de Estudios Superiores de Monterrey, Campus Monterrey}, title = {The Impact of Statistical Word Alignment Quality and Structure in Phrase Based Statistical Machine Translation}, year = {2011} }
Slides
2010
EMDC: a semi-supervised approach for word alignment
Qin Gao, Francisco Guzmán, and Stephan Vogel. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), pages 349-357, 2010.
PDF Abstract

EMDC: a semi-supervised approach for word alignment

This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high precision partial alignments that serve as constraints for a generative aligner which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements on AER. We also experimented with moderate-size Chinese machine translation tasks and got an average of 0.5 point improvement on BLEU scores across five standard NIST test sets and four other test sets.
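As a rough illustration of the idea of constraining a generative aligner with high-precision links, the toy sketch below pins the posterior of constrained target positions to their fixed source word inside a bare-bones IBM Model 1 E-step. This is a deliberate simplification under assumed data structures, not the EMDC implementation.

```python
# Toy illustration of constraining a generative aligner (bare-bones IBM Model 1)
# with high-precision partial links, in the spirit of EMDC. This is a
# simplification for illustration, not the paper's implementation.
from collections import defaultdict

def constrained_model1(bitext, constraints, iters=5):
    """bitext: list of (src_words, tgt_words).
    constraints: list of dicts mapping tgt position -> src position (fixed links)."""
    t = defaultdict(lambda: 1.0)  # t(f|e), uniform (unnormalized) start for the toy
    for _ in range(iters):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for (src, tgt), fixed in zip(bitext, constraints):
            for j, f in enumerate(tgt):
                if j in fixed:
                    # Constrained position: all posterior mass goes to the fixed link.
                    e = src[fixed[j]]
                    counts[(e, f)] += 1.0
                    totals[e] += 1.0
                else:
                    # Unconstrained position: standard Model 1 posterior over source words.
                    z = sum(t[(e, f)] for e in src)
                    for e in src:
                        p = t[(e, f)] / z
                        counts[(e, f)] += p
                        totals[e] += p
        t = defaultdict(lambda: 1e-9,
                        {(e, f): c / totals[e] for (e, f), c in counts.items()})
    return t

if __name__ == "__main__":
    bitext = [(["la", "casa"], ["the", "house"]),
              (["la", "puerta"], ["the", "door"])]
    constraints = [{0: 0}, {}]  # in sentence 1, "the" is pinned to "la"
    t = constrained_model1(bitext, constraints)
    print(round(t[("la", "the")], 3), round(t[("casa", "house")], 3))
```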
BibTex

EMDC: a semi-supervised approach for word alignment

@inproceedings{gao:2010:emdc, Author = {Gao, Qin and Guzm{\'a}n, Francisco and Vogel, Stephan}, Booktitle = {Proceedings of the 23rd International Conference on Computational Linguistics {(COLING 2010)}}, Organization = {Association for Computational Linguistics}, Pages = {349--357}, Title = {EMDC: a semi-supervised approach for word alignment}, address = {Beijing, China}, Year = {2010}, Abstract = {This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high precision partial alignments that serve as constraints for a generative aligner which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements on AER. We also experimented with moderate-size Chinese machine translation tasks and got an average of 0.5 point improvement on BLEU scores across five standard NIST test sets and four other test sets.}}
2009
Reassessment of the Role of Phrase Extraction in PBSMT
Francisco Guzmán, Qin Gao, and Stephan Vogel. In Machine Translation Summit XII, 2009.
PDF Abstract

Reassessment of the Role of Phrase Extraction in PBSMT

In this paper we study in detail the relation between word alignment and phrase extraction. First, we analyze different word alignments according to several characteristics and compare them to hand-aligned data. Second, we analyze the phrase pairs generated by these alignments. We observe that the number of unaligned words has a large impact on the characteristics of the phrase table. A manual evaluation of phrase-pair quality shows that an increase in the number of unaligned words results in lower quality. Finally, we present translation results obtained by using the number of unaligned words as features, from which we obtain up to 2 BLEU points (BP) of improvement.
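For context, the sketch below shows the standard consistency criterion used in phrase extraction, the stage the paper analyzes: a source/target span pair is extractable only if no alignment link crosses its borders, while unaligned words can be absorbed freely, which is why their number shapes the phrase table. The code is illustrative, not the authors'.

```python
# Sketch of the standard phrase-pair consistency check used in phrase extraction.
# Unaligned words can be absorbed into either side of the pair; only crossing
# alignment links block extraction. Illustrative code, not the authors'.

def consistent(alignment, src_span, tgt_span):
    """alignment: set of (i, j) links; spans are inclusive (start, end) ranges."""
    inside = [(i, j) for (i, j) in alignment
              if src_span[0] <= i <= src_span[1] and tgt_span[0] <= j <= tgt_span[1]]
    if not inside:                       # require at least one link inside the pair
        return False
    crossing = any((src_span[0] <= i <= src_span[1]) != (tgt_span[0] <= j <= tgt_span[1])
                   for (i, j) in alignment)
    return not crossing

if __name__ == "__main__":
    links = {(0, 0), (1, 2)}                  # target word 1 is unaligned
    print(consistent(links, (0, 1), (0, 2)))  # True: the unaligned target word is absorbed
    print(consistent(links, (0, 1), (0, 1)))  # False: link (1, 2) crosses the target border
```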
BibTex

Reassessment of the Role of Phrase Extraction in PBSMT

@inproceedings{guzman:2009:reassessment, Author = {Guzm{\'a}n, Francisco and Gao, Qin and Vogel, Stephan}, Booktitle = {The Twelfth Machine Translation Summit ({MTSummit XII})}, Organization = {International Association for Machine Translation}, Title = {Reassessment of the Role of Phrase Extraction in {PBSMT}}, address = {Ottawa, Canada}, Month = {August}, Year = {2009}, Abstract = {In this paper we study in detail the relation between word alignment and phrase extraction. First, we analyze different word alignments according to several characteristics and compare them to hand-aligned data. Second, we analyze the phrase pairs generated by these alignments. We observe that the number of unaligned words has a large impact on the characteristics of the phrase table. A manual evaluation of phrase-pair quality shows that an increase in the number of unaligned words results in lower quality. Finally, we present translation results obtained by using the number of unaligned words as features, from which we obtain up to 2 BLEU points (BP) of improvement.}}
Slides
2008
Translation paraphrases in phrase-based machine translation
Francisco Guzmán, and Leonardo Garrido. In Computational Linguistics and Intelligent Text Processing (CICLing'08), pages 388-398, 2008.
PDF Abstract

Translation paraphrases in phrase-based machine translation

In this paper we present an analysis of a phrase-based machine translation methodology that integrates paraphrases obtained from an intermediary language (French) for translations between Spanish and English. The purpose of the research presented in this document is to find out how much extra information (i.e. improvements in translation quality) can be found when using Translation Paraphrases (TPs). In this document we present an extensive statistical analysis to support conclusions.
BibTex

Translation paraphrases in phrase-based machine translation

@inproceedings{guzman:2008:translation, Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo}, Booktitle = {Computational Linguistics and Intelligent Text Processing {(CICLing'08)}}, Pages = {388--398}, address = {Haifa, Israel}, Title = {Translation paraphrases in phrase-based machine translation}, Publisher = {Springer Berlin Heidelberg}, Month = { February}, Year = {2008}, Abstract = { In this paper we present an analysis of a phrase-based machine translation methodology that integrates paraphrases obtained from an intermediary language (French) for translations between Spanish and English. The purpose of the research presented in this document is to find out how much extra information (i.e. improvements in translation quality) can be found when using Translation Paraphrases (TPs). In this document we present an extensive statistical analysis to support conclusions.}}
Slides
2007
Using Translation Paraphrases from Trilingual Corpora to Improve Phrase-Based Statistical Machine Translation: A Preliminary Report
Francisco Guzmán, and Leonardo Garrido. In Sixth Mexican International Conference on Artificial Intelligence (MICAI-07), pages 163-172, 2007.
PDF Abstract

Using Translation Paraphrases from Trilingual Corpora to Improve Phrase-Based Statistical Machine Translation: A Preliminary Report

Statistical methods have proven to be very effective when addressing linguistic problems, especially in Machine Translation. Nevertheless, the effectiveness of Statistical Machine Translation is limited to situations where large amounts of training data are available: the broader the coverage of an SMT system, the better its chances of producing a reasonable output. In this paper we propose a method to improve the translation quality of a phrase-based Machine Translation system by extending its phrase tables with translation paraphrases learned from a third language. Our experiments were done translating from Spanish to English, pivoting through French.
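A toy sketch of the pivoting idea described above: composing Spanish→French and French→English phrase-table entries into Spanish→English entries by multiplying their probabilities and keeping the best score per pair. The table layout and scoring are illustrative assumptions, not the paper's actual procedure.

```python
# Toy sketch of pivoting a phrase table through a third language: compose
# es->fr and fr->en entries into es->en paraphrase entries, multiplying
# probabilities and keeping the best score per pair. Table layout is an
# assumption for illustration, not the paper's procedure.

def pivot(es_fr, fr_en):
    """es_fr, fr_en: dicts of the form {source_phrase: {target_phrase: prob}}."""
    es_en = {}
    for es, fr_probs in es_fr.items():
        for fr, p1 in fr_probs.items():
            for en, p2 in fr_en.get(fr, {}).items():
                score = p1 * p2
                es_en.setdefault(es, {})
                es_en[es][en] = max(es_en[es].get(en, 0.0), score)
    return es_en

if __name__ == "__main__":
    es_fr = {"casa blanca": {"maison blanche": 0.5}}
    fr_en = {"maison blanche": {"white house": 0.8, "white home": 0.1}}
    print(pivot(es_fr, fr_en))
    # -> {'casa blanca': {'white house': 0.4, 'white home': 0.05}}
```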
BibTex

Using Translation Paraphrases from Trilingual Corpora to Improve Phrase-Based Statistical Machine Translation: A Preliminary Report

@inproceedings{guzman:2007:using, Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo}, Booktitle = {Sixth Mexican International Conference on Artificial Intelligence {(MICAI'07)}}, Organization = {IEEE}, Pages = {163--172}, Title = {Using Translation Paraphrases from Trilingual Corpora to Improve Phrase-Based Statistical Machine Translation: A Preliminary Report}, Month = {November}, address = {Aguascalientes, Mexico}, Year = {2007}, Abstract = {Statistical methods have proven to be very effective when addressing linguistic problems, especially in Machine Translation. Nevertheless, the effectiveness of Statistical Machine Translation is limited to situations where large amounts of training data are available: the broader the coverage of an SMT system, the better its chances of producing a reasonable output. In this paper we propose a method to improve the translation quality of a phrase-based Machine Translation system by extending its phrase tables with translation paraphrases learned from a third language. Our experiments were done translating from Spanish to English, pivoting through French.}}
Slides
2005
Towards Traffic Light Control Through a Cooperative Multiagent System: A Simulation-Based Study
Francisco Guzmán, and Leonardo Garrido. In Proceedings of the 2005 Agent-Directed Simulation Symposium (ADS05) at the 2005 Spring Simulation Multiconference (SpringSim'05), pages 29-35, 2005.
PDF Abstract

Towards Traffic Light Control Through a Cooperative Multiagent System: A Simulation-Based Study

Present-day traffic networks are unable to efficiently handle the daily car traffic through urban areas. We think that multiagent systems are an excellent way of doing microscopic simulation and can thus provide possible solutions to the traffic control problem. In this paper, we present a simulation-based study in which we simulate traffic networks and optimize them via a cooperative multiagent system for traffic light control. The system simulates the traffic at an intersection, minimizing the time that each car has to wait in order to be served. Light agents can communicate with each other in order to negotiate and share their light times. Our experimental results show how our approach can decrease the average car delay as the spawn probability increases, while varying the service time and the number of traffic light sets at a specific intersection. These results show important improvements when using our multiagent light control system.
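As a loose illustration of the cooperative idea described above (not the paper's simulator), the toy sketch below models one intersection where two light agents periodically shift green time toward the approach with the longer queue; all parameters and behaviour are assumptions for illustration.

```python
# Toy round-based sketch of cooperative light control at one intersection:
# two light agents share a fixed cycle and shift green time toward the
# approach with the longer queue. Purely illustrative; parameters and
# behaviour are assumptions, not the paper's simulator.
import random

def simulate(steps=200, cycle=20, spawn_p=0.3, serve_rate=1, seed=0):
    random.seed(seed)
    queues = [0, 0]                      # cars waiting on the two approaches
    green = [cycle // 2, cycle // 2]     # green time each agent currently holds
    total_wait = 0
    for t in range(steps):
        # Cars arrive independently on each approach with probability spawn_p.
        for k in range(2):
            if random.random() < spawn_p:
                queues[k] += 1
        # Every `cycle` steps the agents "negotiate": the busier approach
        # takes one unit of green time from the other (within bounds).
        if t % cycle == 0 and queues[0] != queues[1]:
            busy, idle = (0, 1) if queues[0] > queues[1] else (1, 0)
            if green[idle] > 2:
                green[busy] += 1
                green[idle] -= 1
        # The phase position within the cycle decides which approach is served.
        active = 0 if (t % cycle) < green[0] else 1
        queues[active] = max(0, queues[active] - serve_rate)
        total_wait += sum(queues)        # queued cars per step as a delay proxy
    return total_wait / steps

if __name__ == "__main__":
    print("average queued cars per step:", round(simulate(), 2))
```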
BibTex

Towards Traffic Light Control Through a Cooperative Multiagent System: A Simulation-Based Study

@inproceedings{guzman:2005:towards, Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo}, Booktitle = {Proceedings of the 2005 Agent--Directed Simulation Symposium ({ADS05}) at the 2005 Spring Simulation Multiconference ({SpringSim'05})}, Pages = {29--35}, Title = {Towards Traffic Light Control Through a Cooperative Multiagent System: A Simulation--Based Study}, address = {San Diego, California, USA}, month = {April}, Year = {2005}, Abstract = {Present-day traffic networks are unable to efficiently handle the daily car traffic through urban areas. We think that multiagent systems are an excellent way of doing microscopic simulation and can thus provide possible solutions to the traffic control problem. In this paper, we present a simulation-based study in which we simulate traffic networks and optimize them via a cooperative multiagent system for traffic light control. The system simulates the traffic at an intersection, minimizing the time that each car has to wait in order to be served. Light agents can communicate with each other in order to negotiate and share their light times. Our experimental results show how our approach can decrease the average car delay as the spawn probability increases, while varying the service time and the number of traffic light sets at a specific intersection. These results show important improvements when using our multiagent light control system.}}
Slides
Multiagent-Based Traffic Simulation
Francisco Guzmán, and Leonardo Garrido. Technical Report CSI-RI-005, Tecnológico de Monterrey, 2005.
BibTex

Multiagent-Based Traffic Simulation

@techreport{guzman:2005:multiagent, Author = {Guzm{\'a}n, Francisco and Garrido, Leonardo}, Institution = {Tecnol{\'o}gico de Monterrey ({ITESM})}, Type = {Technical Report}, Number = {CSI-RI-005}, Title = {Multiagent--Based Traffic Simulation}, address = {Monterrey, Mexico}, Year = {2005}}

Talks

A Brief Introduction to MT Evaluation
I gave the talk "A Brief Introduction to MT Evaluation" at the First MT Marathon in the Americas, 2015.
Preparing for your summer internship. Hot Summer/Cool Research program.
I gave a talk to my colleagues at QCRI about my recommendations on how to prepare for the summer internship program, 2015.
Machine Translation: A Game Changer for the Translation Professionals?
Introduction to SMT and post-editing for professional translators, with Stephan Vogel. Sixth Annual International Translation Conference (TII), 2015.
Método para la Traducción Automática Estadística (Method for Statistical Machine Translation)
An introductory talk on SMT, given in Spanish at the Tecnológico de Orizaba, 2009.

Last updated January 29, 2017.
Created with git, jekyll, bootstrap, and sublime text.
Website template available on GitHub, forked from Adam Lopez's site.