Laboratory for Computational Studies of Language

Natural Language Processing (also known as Computational Linguistics) aims to bring together technological and scientific studies of human languages, in particular, their processing by machine, with the dual goals of advancing language technology and understanding the computational basis of human language capacity.

The Laboratory for Computational Studies of Language situates itself at the intersection of computer science, cognitive science and linguistics. Established in 1994, the lab spent its first ten years developing morphological analyzers, syntactic processors and machine translation systems using rule-driven methodology. More recently, the lab has become involved in data-intensive linguistics, such as linguistic annotation for morphology, syntax, intonation and discourse, aiming to combine data-driven techniques with rule-based systems. Current projects of staff members on and off campus include search engines, statistical models of grammar acquisition, wide-coverage parsing and discourse grammar development.

We have completed two internationally sponsored and two nationally sponsored projects. The lab's graduates account for more than 25 master's theses and 3 PhDs.

Projects

  • Language pairing on functional structure: LFG-based MT for English↔Turkish. (1999-2001). Sponsored by AppTek/Lernout & Hauspie.
  • Turkish NLP Initiative (1994-1998). Sponsored by NATO Science Division SfS III. PI: Kemal Oflazer. Co-PI: Cem Bozsahin.
  • A Grammar Architecture for Computational Analysis of Turkish. (1993-1995). Sponsored by TUBITAK. PI: Cem Bozsahin.

Selected Publications

  • Bozsahin, Cem, The Combinatory Morphemic Lexicon. Computational Linguistics, 28(2):145-186, 2002.
  • O. Yuksel, C. Bozsahin, Contextually Appropriate Reference Generation, Natural Language Engineering, 8(1):69-89, 2002.
  • Birturk, Aysenur Akyuz and Sandiway Fong, A Modular Approach to Turkish Noun Compounding: The Integration of a Finite-State Model. Proceedings of the 6th Natural Language Processing Pacific Rim Symposium (NLPRS2001), November 27-29, 2001, Tokyo, Japan.
  • B. Karagol-Ayan, Morphosyntactic Generation of Turkish from Predicate-Argument Structure. Proceedings of the COLING 2000 Student Session, Hong Kong.
  • Cem Bozsahin, Gapping and Word Order in Turkish. Proc. of 10th Int. Conf. on Turkish Linguistics (ICTL 2000), Istanbul, 2000.
  • C. Bozsahin, D. Zeyrek, Dilbilgisi, bilisim ve bilissel bilim [Grammar, Computation and Cognitive Science]. In Dilbilim Arastirmalari 2000 [Research in Linguistics, vol. 11].
  • O. Sehitoglu, C. Bozsahin, Lexical Rules and Lexical Organization. In Breadth and Depth of Semantic Lexicons, E. Viegas (ed.), Kluwer, 1999.
  • C. Bozsahin, Deriving the Predicate-Argument Structure for a Free Word Order Language. Proceedings of COLING-ACL'98, pp. 167-173, Montreal.
  • I. Pembeci, C. Bozsahin, D. Zeyrek, Computer-aided Learning of Turkish Morphology. Proceedings of Joint Conf. of ACH and ALLC, July 1998, Debrecen, Hungary.
  • C. (Keyder) Turhan, An English to Turkish Machine Translation System Using Structural Mapping. Proc. of the Fifth Conference on Applied NLP (ANLP'97), 31 March - 3 April, 1997, Washington, D.C.
  • C. Bozsahin, Ulamsal dilbilgisi ve Turkce [Categorial Grammar and Turkish]. Dilbilim Arastirmalari 1996 [Research in Linguistics], 7:230-244.
  • C. Bozsahin, E. Gocmen, A Categorial Framework for Composition in Multiple Linguistic Domains. In Proc. of the 4th Int. Conf. on Cognitive Science of NLP (CSNLP'95), Dublin.
  • Küçük, Dilek and Turhan-Yöndem, Meltem. 2007. 'A Knowledge-poor Pronoun Resolution System for Turkish', In Proceedings of the 6th Discourse Anaphora and Anaphora Resolution Colloquium. Lagos, Portugal.
  • Küçük, Dilek and Turhan-Yöndem, Meltem. 2007. 'Automatic Identification of Pronominal Anaphora in Turkish Texts', In Proceedings of the 22nd International Symposium on Computer and Information Sciences. Ankara, Turkey, to appear.
  • O. Bayer, T. Çiloğlu, M. Turhan Yöndem, “Investigation of Different Language Models for Turkish Speech Recognition”, Proceedings of the IEEE 14th Signal Processing and Communications Applications Conference, 2006.
  • M. Turhan Yöndem, G. Ucoluk, “A Realistic Success Criterion for Discourse Segmentation”, In Proceedings of ISCIS-2003, LNCS 2869, p. 592, Springer Verlag, 2003.