Cross-Lingual Eigenwords (CL-Eigenwords) is a cross-lingual word embedding method based on spectral graph embedding.

biplot of word vectors

The figure above is a two-dimensional PCA visualization of CL-Eigenwords word vectors for the names of countries and their languages in English (blue) and Spanish (red). The two words of each (country, language) pair, e.g. (italy, italian), are connected by a line segment.
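
A visualization of this kind can be sketched roughly as follows. This is a minimal illustration, not the code used to produce the figure: the vectors are random placeholders rather than real CL-Eigenwords embeddings, and the function name `pca_2d`, the word list, and the dimensionality of 50 are assumptions made for the example.

```python
import numpy as np


def pca_2d(vectors):
    """Project row vectors onto their first two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Placeholder data: random 50-dimensional vectors stand in for real embeddings.
rng = np.random.default_rng(0)
pairs = [("italy", "italian"), ("france", "french"), ("japan", "japanese")]
words = [word for pair in pairs for word in pair]
vectors = rng.normal(size=(len(words), 50))

# Map each word to its 2-D coordinates.
coords = dict(zip(words, pca_2d(vectors)))

# Each (country, language) pair becomes one line segment in the plane.
segments = [(coords[country], coords[language]) for country, language in pairs]
```

Each entry of `segments` holds the two endpoints of one line segment, which can be passed to any plotting library to draw the connected pairs.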

Publications

  • Oshikiri, T., Fukui, K., & Shimodaira, H. (2016). Cross-Lingual Word Representations via Spectral Graph Embeddings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 493–498). Berlin, Germany: Association for Computational Linguistics.

Slides

  • Shimodaira, H. (2014). A simple coding for cross-domain matching with dimension reduction via spectral graph embedding. arXiv preprint arXiv:1412.8380.
  • Wang, R., Zhao, H., Ploux, S., Lu, B., Utiyama, M., & Sumita, E. (2016). A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique. arXiv preprint arXiv:1607.08692.