Year: 2016

  1. Georgios Petasis and Vangelis Karkaletsis.
    Identifying Argument Components through TextRank.
    In Proceedings of the 3rd Workshop on Argument Mining (ArgMining2016). August 2016, 56–66.

    @inproceedings{Petasis-EtAl:2016:ARG-MINING,
    	author = "Georgios Petasis and Vangelis Karkaletsis",
    	title = "Identifying Argument Components through TextRank",
    	booktitle = "Proceedings of the 3rd Workshop on Argument Mining (ArgMining2016)",
    	month = "August",
    	year = 2016,
    	address = "Berlin, Germany",
    	publisher = "Association for Computational Linguistics",
    	pages = "56--66",
    	url = "http://aclweb.org/anthology/W/W16/W16-2811.pdf"
    }
    
  2. Ioannis Manousos Katakis, Georgios Petasis and Vangelis Karkaletsis.
    CLARIN-EL Web-based Annotation Tool.
    In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk and Stelios Piperidis (eds.). Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016. 2016.

    @inproceedings{DBLP:conf/lrec/KatakisPK16,
    	author = "Ioannis Manousos Katakis and Georgios Petasis and Vangelis Karkaletsis",
    	title = "{CLARIN-EL} Web-based Annotation Tool",
    	booktitle = "Proceedings of the Tenth International Conference on Language Resources and Evaluation {LREC} 2016, Portoro{\v{z}}, Slovenia, May 23-28, 2016.",
    	editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and H{\'{e}}l{\`{e}}ne Mazo and Asunci{\'{o}}n Moreno and Jan Odijk and Stelios Piperidis",
    	year = 2016,
    	publisher = "European Language Resources Association {(ELRA)}",
    	url = "http://www.lrec-conf.org/proceedings/lrec2016/summaries/990.html",
    	timestamp = "Tue, 30 Aug 2016 18:49:47 +0200",
    	biburl = "http://dblp.uni-trier.de/rec/bib/conf/lrec/KatakisPK16",
    	bibsource = "dblp computer science bibliography, http://dblp.org"
    }
    

Year: 2015

  1. Theodosis Goudas, Christos Louizos, Georgios Petasis and Vangelis Karkaletsis.
    Argument Extraction from News, Blogs, and the Social Web.
    International Journal on Artificial Intelligence Tools 24(05):1540024, 2015.

    @article{doi:10.1142/S0218213015400242,
    	author = "Goudas, Theodosis and Louizos, Christos and Petasis, Georgios and Karkaletsis, Vangelis",
    	title = "Argument Extraction from News, Blogs, and the Social Web",
    	journal = "International Journal on Artificial Intelligence Tools",
    	volume = 24,
    	number = 05,
    	pages = 1540024,
    	year = 2015,
    	doi = "10.1142/S0218213015400242",
    	url = "http://www.worldscientific.com/doi/abs/10.1142/S0218213015400242",
    	eprint = {http://www.worldscientific.com/doi/pdf/10.1142/S0218213015400242},
    	abstract = {Argument extraction is the task of identifying arguments, along with their components in text. Arguments can be usually decomposed into a claim and one or more premises justifying it. Among the novel aspects of this work is the thematic domain itself which relates to Social Media, in contrast to traditional research in the area, which concentrates mainly on law documents and scientific publications. The huge increase of social media communities, along with their user tendency to debate, makes the identification of arguments in these texts a necessity. Argument extraction from Social Media is more challenging because texts may not always contain arguments, as is the case of legal documents or scientific publications usually studied. In addition, being less formal in nature, texts in Social Media may not even have proper syntax or spelling. This paper presents a two-step approach for argument extraction from social media texts. During the first step, the proposed approach tries to classify the sentences into "sentences that contain arguments" and "sentences that don’t contain arguments". In the second step, it tries to identify the exact fragments that contain the premises from the sentences that contain arguments, by utilizing conditional random fields. The results exceed significantly the base line approach, and according to literature, are quite promising.}
    }
    
  2. Anastasia Krithara, George Giannakopoulos, George Paliouras, George Petasis and Vangelis Karkaletsis.
    Predicting Sentiment using Transfer Learning.
    In Workshop on Replicability and Reproducibility in Natural Language Processing: adaptive methods, resources and software at IJCAI 2015 (AdaptiveNLP 2015). 2015.

    @inproceedings{ref35,
    	author = "Anastasia Krithara and George Giannakopoulos and George Paliouras and George Petasis and Vangelis Karkaletsis",
    	title = "Predicting Sentiment using Transfer Learning",
    	year = 2015,
    	booktitle = "Workshop on Replicability and Reproducibility in Natural Language Processing: adaptive methods, resources and software at IJCAI 2015 (AdaptiveNLP 2015)",
    	abstract = "A new transfer learning method is presented in this paper, addressing the task of sentiment analysis across domains. The proposed approach is a transfer variant of the Probabilistic Latent Semantic Analysis (PLSA) model that we name KLIEP-PLSA. The approach captures the difference of the distributions between the different domains. We perform experiments over well known datasets and show the promising results that we obtained with the new method.",
    	keywords = "transfer learning, KLIEP-PLSA, PLSA, sentiment analysis",
    	note = "Workshop website: https://sites.google.com/site/adaptivenlp2015/",
    	url = "http://www.ellogon.org/petasis/bibliography/IJCAI2015/Krithara-TL_sentiment.pdf"
    }
    
  3. Christos Sardianos, Ioannis Manousos Katakis, Georgios Petasis and Vangelis Karkaletsis.
    Argument Extraction from News.
    In Proceedings of the 2nd Workshop on Argumentation Mining. June 2015, 56–66.

    @inproceedings{sardianos-EtAl:2015:ARG-MINING,
    	author = "Sardianos, Christos and Katakis, Ioannis Manousos and Petasis, Georgios and Karkaletsis, Vangelis",
    	title = "Argument Extraction from News",
    	booktitle = "Proceedings of the 2nd Workshop on Argumentation Mining",
    	month = "June",
    	year = 2015,
    	address = "Denver, CO",
    	publisher = "Association for Computational Linguistics",
    	pages = "56--66",
    	url = "http://www.aclweb.org/anthology/W15-0508"
    }
    

Year: 2014

  1. Georgios Petasis.
    Annotating Arguments: The NOMAD Collaborative Annotation Tool.
    In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk and Stelios Piperidis (eds.). Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014. 2014, 1930-1937.

    @inproceedings{DBLP:conf/lrec/Petasis14,
    	author = "Georgios Petasis",
    	title = "Annotating Arguments: The NOMAD Collaborative Annotation Tool",
    	booktitle = "Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014",
    	pages = "1930--1937",
    	ee = "http://www.lrec-conf.org/proceedings/lrec2014/summaries/669.html",
    	editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asunci{\'o}n Moreno and Jan Odijk and Stelios Piperidis",
    	publisher = "European Language Resources Association (ELRA)",
    	year = 2014,
    	bibsource = "DBLP, http://dblp.uni-trier.de"
    }
    
  2. Georgios Petasis.
    The Ellogon Pattern Engine: Context-free Grammars over Annotations.
    In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk and Stelios Piperidis (eds.). Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014. 2014, 2460-2465.

    @inproceedings{DBLP:conf/lrec/Petasis14a,
    	author = "Georgios Petasis",
    	title = "The Ellogon Pattern Engine: Context-free Grammars over Annotations",
    	pages = "2460--2465",
    	ee = "http://www.lrec-conf.org/proceedings/lrec2014/summaries/1060.html",
    	booktitle = "Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014",
    	editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asunci{\'o}n Moreno and Jan Odijk and Stelios Piperidis",
    	publisher = "European Language Resources Association (ELRA)",
    	year = 2014,
    	bibsource = "DBLP, http://dblp.uni-trier.de"
    }
    
  3. George Kiomourtzis, George Giannakopoulos, Georgios Petasis, Pythagoras Karampiperis and Vangelis Karkaletsis.
    NOMAD: Linguistic Resources and Tools Aimed at Policy Formulation and Validation.
    In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk and Stelios Piperidis (eds.). Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014. 2014, 3464-3470.

    @inproceedings{DBLP:conf/lrec/KiomourtzisGPKK14,
    	author = "George Kiomourtzis and George Giannakopoulos and Georgios Petasis and Pythagoras Karampiperis and Vangelis Karkaletsis",
    	title = "NOMAD: Linguistic Resources and Tools Aimed at Policy Formulation and Validation",
    	pages = "3464--3470",
    	ee = "http://www.lrec-conf.org/proceedings/lrec2014/summaries/813.html",
    	booktitle = "Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014",
    	editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asunci{\'o}n Moreno and Jan Odijk and Stelios Piperidis",
    	publisher = "European Language Resources Association (ELRA)",
    	year = 2014,
    	bibsource = "DBLP, http://dblp.uni-trier.de"
    }
    
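  4. Theodosis Goudas, Christos Louizos, Georgios Petasis and Vangelis Karkaletsis.
    Argument Extraction from News, Blogs, and Social Media.
    In Aristidis Likas, Konstantinos Blekas and Dimitris Kalles (eds.). Artificial Intelligence: Methods and Applications: 8th Hellenic Conference on AI, SETN 2014, Ioannina, Greece, May 15-17, 2014. Proceedings. Springer International Publishing, 2014, pages 287–299.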

    @inbook{Goudas2014,
    	author = "Goudas, Theodosis and Louizos, Christos and Petasis, Georgios and Karkaletsis, Vangelis",
    	editor = "Likas, Aristidis and Blekas, Konstantinos and Kalles, Dimitris",
    	title = "Argument Extraction from News, Blogs, and Social Media",
    	booktitle = "Artificial Intelligence: Methods and Applications: 8th Hellenic Conference on AI, SETN 2014, Ioannina, Greece, May 15-17, 2014. Proceedings",
    	year = 2014,
    	publisher = "Springer International Publishing",
    	address = "Cham",
    	pages = "287--299",
    	isbn = "978-3-319-07064-3",
    	doi = "10.1007/978-3-319-07064-3_23",
    	url = "http://dx.doi.org/10.1007/978-3-319-07064-3_23"
    }
    
  5. Georgios Petasis, Dimitris Spiliotopoulos, Nikos Tsirakis and Panayotis Tsantilas.
    Sentiment Analysis for Reputation Management: Mining the Greek Web.
    In Aristidis Likas, Konstantinos Blekas and Dimitris Kalles (eds.). Artificial Intelligence: Methods and Applications - 8th Hellenic Conference on AI, SETN 2014, Ioannina, Greece, May 15-17, 2014. Proceedings 8445. 2014, 327-340.

    @inproceedings{SETN-2014-Petasis,
    	author = "Georgios Petasis and Dimitris Spiliotopoulos and Nikos Tsirakis and Panayotis Tsantilas",
    	title = "Sentiment Analysis for Reputation Management: Mining the Greek Web",
    	editor = "Aristidis Likas and Konstantinos Blekas and Dimitris Kalles",
    	booktitle = "Artificial Intelligence: Methods and Applications - 8th Hellenic Conference on AI, SETN 2014, Ioannina, Greece, May 15-17, 2014. Proceedings",
    	year = 2014,
    	pages = "327--340",
    	publisher = "Springer",
    	series = "Lecture Notes in Computer Science",
    	volume = 8445,
    	ee = "http://dx.doi.org/10.1007/978-3-319-07064-3_26",
    	isbn = "978-3-319-07063-6, 978-3-319-07064-3",
    	bibsource = "DBLP, http://dblp.uni-trier.de"
    }
    

Year: 2013

  1. Georgios Petasis, Dimitrios Spiliotopoulos, Nikos Tsirakis and Panayiotis Tsantilas.
    Large-scale Sentiment Analysis for Reputation Management.
    In Stefan Gindl, Robert Remus and Michael Wiegand (eds.). Proceedings of the 2nd Workshop on Practice and Theory of Opinion Mining and Sentiment Analysis (PATHOS-2013). 2013.

    @inproceedings{pathos-2013-Petasis,
    	author = "Georgios Petasis and Dimitrios Spiliotopoulos and Nikos Tsirakis and Panayiotis Tsantilas",
    	abstract = "Harvesting the web and social web data is a meticulous and complex task. Applying the results to a successful business case such as brand monitoring requires high precision and recall for the opinion mining and entity recognition tasks. This work reports on the integrated platform of a state-of-the-art Named-entity Recognition and Classification (NERC) system and opinion mining methods for a Software-as-a-Service (SaaS) approach on a fully automatic service for brand monitoring for the Greek language. The service has been successfully deployed to the biggest search engine in Greece powering the large-scale linguistic and sentiment analysis of about 80,000 resources per hour.",
    	title = "Large-scale Sentiment Analysis for Reputation Management",
    	booktitle = "Proceedings of the 2nd Workshop on Practice and Theory of Opinion Mining and Sentiment Analysis (PATHOS-2013)",
    	address = "Darmstadt, Germany",
    	month = "September 23",
    	year = 2013,
    	editor = "Stefan Gindl and Robert Remus and Michael Wiegand"
    }
    
  2. Georgios Petasis.
    Structuring the Blogosphere on News from Traditional Media.
    In On the Move to Meaningful Internet Systems: OTM 2013 Workshops - Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, ACM, EI2N, ISDE, META4eS, ORM, SeDeS, SINCOM, SMS, and SOMOCO 2013 8186. 2013, 608-617.

    @inproceedings{DBLP:conf/otm/Petasis13,
    	author = "Georgios Petasis",
    	title = "Structuring the Blogosphere on News from Traditional Media",
    	booktitle = "On the Move to Meaningful Internet Systems: OTM 2013 Workshops - Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, ACM, EI2N, ISDE, META4eS, ORM, SeDeS, SINCOM, SMS, and SOMOCO 2013",
    	address = "Graz, Austria",
    	month = "September 9--13",
    	year = 2013,
    	pages = "608--617",
    	ee = "http://dx.doi.org/10.1007/978-3-642-41033-8_77",
    	bibsource = "DBLP, http://dblp.uni-trier.de",
    	editor = "Yan Tang Demey and Herv{\'e} Panetto",
    	publisher = "Springer",
    	series = "Lecture Notes in Computer Science",
    	volume = 8186,
    	isbn = "978-3-642-41032-1"
    }
    
  3. Georgios Petasis, Ralf Möller and Vangelis Karkaletsis.
    BOEMIE: Reasoning-based Information Extraction.
    In Proceedings of the 1st Workshop on Natural Language Processing and Automated Reasoning co-located with 12th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR 2013) 1044. 2013, 60-75.

    @inproceedings{DBLP:conf/lpnmr/PetasisMK13,
    	author = {Georgios Petasis and Ralf M{\"o}ller and Vangelis Karkaletsis},
    	abstract = "This paper presents a novel approach for exploiting an ontology in an ontology-based information extraction system, which substitutes part of the extraction process with reasoning, guided by a set of automatically acquired rules.",
    	title = "BOEMIE: Reasoning-based Information Extraction",
    	address = "A Coru{\~n}a, Spain",
    	month = "September 15",
    	year = 2013,
    	pages = "60--75",
    	ee = "http://ceur-ws.org/Vol-1044/paper-06.pdf",
    	bibsource = {DBLP, http://dblp.uni-trier.de},
    	editor = {Chitta Baral and Peter Sch{\"u}ller},
    	booktitle = "Proceedings of the 1st Workshop on Natural Language Processing and Automated Reasoning co-located with 12th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR 2013)",
    	publisher = "CEUR-WS.org",
    	series = "CEUR Workshop Proceedings",
    	volume = 1044
    }
    

Year: 2012

  1. Georgios Petasis and Mara Tsoumari.
    A New Annotation Tool for Aligned Bilingual Corpora.
    In Petr Sojka, Aleš Horák, Ivan Kopeček and Karel Pala (eds.). Text, Speech and Dialogue. Series Lecture Notes in Computer Science, volume 7499, Springer Berlin Heidelberg, 2012, pages 95-104.

    @incollection{tsd-2012-Petasis-Tsoumari,
    	author = "Petasis, Georgios and Tsoumari, Mara",
    	abstract = "This paper presents a new annotation tool for aligned bilingual corpora, which allows the annotation of a wide range of information, ranging from information about words (such as part-of-speech tags or named-entities) to quite complex annotation schemas involving links between aligned segments, such as co-reference or translation equivalence between aligned segments in the two languages. The annotation tool is implemented as a component of the Ellogon language engineering platform, exploiting its extensive annotation engine, its cross-platform abilities and its linguistic processing components, if such a need arises. The new annotation tool is distributed with an open source license (LGPL), as part of the Ellogon language engineering platform.",
    	address = "Brno, Czech Republic",
    	booktitle = "Text, Speech and Dialogue",
    	booksubtitle = "15th International Conference, TSD 2012, Brno, Czech Republic, September 3--7, 2012. Proceedings",
    	keywords = "Annotation tools; collaborative annotation; adaptable annotation schemas",
    	month = "September 3--7",
    	title = "{A} {N}ew {A}nnotation {T}ool for {A}ligned {B}ilingual {C}orpora",
    	url = "http://www.ellogon.org/petasis/bibliography/TSD2012/tsd450.pdf",
    	year = 2012,
    	pages = "95--104",
    	isbn = "978-3-642-32789-6",
    	volume = 7499,
    	series = "Lecture Notes in Computer Science",
    	editor = "Sojka, Petr and Hor\'{a}k, Ale\v{s} and Kope\v{c}ek, Ivan and Pala, Karel",
    	doi = "10.1007/978-3-642-32790-2_11",
    	publisher = "Springer Berlin Heidelberg"
    }
    
  2. Georgios Petasis.
    The SYNC3 Collaborative Annotation Tool.
    In Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012. May 2012, 363–370.

    @inproceedings{lrec-2012-Petasis,
    	author = "Georgios Petasis",
    	abstract = "The huge amount of the available information in the Web creates the need for effective information extraction systems that are able to produce metadata that satisfy users' information needs. The development of such systems, in the majority of cases, depends on the availability of an appropriately annotated corpus in order to learn or evaluate extraction models. The production of such corpora can be significantly facilitated by annotation tools, which provide user-friendly facilities and enable annotators to annotate documents according to a predefined annotation schema. However, the construction of annotation tools that operate in a distributed environment is a challenging task: the majority of these tools are implemented as Web applications, having to cope with the capabilities offered by browsers. This paper describes the SYNC3 collaborative annotation tool, which implements an alternative architecture: it remains a desktop application, fully exploiting the advantages of desktop applications, but provides collaborative annotation through the use of a centralised server for storing both the documents and their metadata, and instant messaging protocols for communicating events among all annotators. The annotation tool is implemented as a component of the Ellogon language engineering platform, exploiting its extensive annotation engine, its cross-platform abilities and its linguistic processing components, if such a need arises. Finally, the SYNC3 annotation tool is distributed with an open source license, as part of the Ellogon platform.",
    	address = "Istanbul, Turkey",
    	booktitle = "Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012",
    	keywords = "annotation tools, collaborative annotation, adaptable annotation schemas",
    	month = "May",
    	pages = "363--370",
    	publisher = "European Language Resources Association",
    	title = "{T}he {SYNC}3 {C}ollaborative {A}nnotation {T}ool",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2012/LREC2012-700.pdf",
    	year = 2012
    }
    
  3. Elias Iosif, Georgios Petasis and Vangelis Karkaletsis.
    Ontology-Based Information Extraction under a Bootstrapping Approach.
    In Maria Teresa Pazienza and Armando Stellato (eds.). Semi-Automatic Ontology Development: Processes and Resources. IGI Global, April 2012, pages 1–21.

    @incollection{Iosif.IGIGLOBAL.2012,
    	author = "Elias Iosif and Georgios Petasis and Vangelis Karkaletsis",
    	abstract = "The authors present an ontology-based information extraction process, which operates in a bootstrapping framework. The novelty of this approach lies in the continuous semantics extraction from textual content in order to evolve the underlying ontology, while the evolved ontology enhances in turn the information extraction mechanism. This process was implemented in the context of the R{\&}D project BOEMIE. The BOEMIE system was evaluated on the athletics domain.",
    	editor = "Maria Teresa Pazienza and Armando Stellato",
    	booktitle = "Semi-Automatic Ontology Development: Processes and Resources",
    	address = "Hershey, PA, USA",
    	chapter = 1,
    	doi = "10.4018/978-1-4666-0188-8.ch001",
    	isbn = 9781466601888,
    	month = "April",
    	pages = "1--21",
    	publisher = "IGI Global",
    	title = "{O}ntology-{B}ased {I}nformation {E}xtraction under a {B}ootstrapping {A}pproach",
    	url = "http://www.igi-global.com/chapter/ontology-based-information-extraction-under/63896",
    	year = 2012
    }
    

Year: 2011

  1. Nikos Sarris, Gerasimos Potamianos, Jean-Michel Renders, Claire Grover, Eric Karstens, Leonidas Kallipolitis, Vasilis Tountopoulos, Georgios Petasis, Anastasia Krithara, Matthias Gallé, Guillaume Jacquet, Beatrice Alex, Richard Tobin and Liliana Bounegru.
    A System for Synergistically Structuring News Content from Traditional Media and the Blogosphere.
    In Paul Cunningham and Miriam Cunningham (eds.). eChallenges e-2011 Conference Proceedings. 2011.

    @inproceedings{eChallenges-2011-Sarris,
    	author = "Nikos Sarris and Gerasimos Potamianos and Jean-Michel Renders and Claire Grover and Eric Karstens and Leonidas Kallipolitis and Vasilis Tountopoulos and Petasis, Georgios and Anastasia Krithara and Matthias Gall{\'e} and Guillaume Jacquet and Beatrice Alex and Richard Tobin and Liliana Bounegru",
    	abstract = {News and social media are emerging as a dominant source of information for numerous applications. However, their vast unstructured content presents challenges to efficient extraction of such information. In this paper, we present the SYNC3 system that aims to intelligently structure content from both traditional news media and the blogosphere. To achieve this goal, SYNC3 incorporates innovative algorithms that first model news media content statistically, based on fine clustering of articles into so-called {"}news events{"}. Such models are then adapted and applied to the blogosphere domain, allowing its content to map to the traditional news domain. Furthermore, appropriate algorithms are employed to extract news event labels and relations between events, in order to efficiently present news content to the system end users.},
    	booktitle = "eChallenges e-2011 Conference Proceedings",
    	editor = "Paul Cunningham and Miriam Cunningham",
    	organization = "IIMC International Information Management Corporation",
    	address = "Florence, Italy",
    	month = "October 26--28",
    	title = "{A} {S}ystem for {S}ynergistically {S}tructuring {N}ews {C}ontent from {T}raditional {M}edia and the {B}logosphere",
    	year = 2011,
    	url = "http://www.ellogon.org/petasis/bibliography/eChallenges2011/echallenges_ref_81_doc_7322.pdf",
    	isbn = "978-1-905824-27-4"
    }
    
  2. Mara Tsoumari and Georgios Petasis.
    Coreference Annotator - A new annotation tool for aligned bilingual corpora.
    In Proceedings of the Second Workshop on Annotation and Exploitation of Parallel Corpora (AEPC 2), in 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011). 2011, 43–52.

    @inproceedings{AEPC2-RANLP-2011-Tsoumari,
    	author = "Tsoumari, Mara and Petasis, Georgios",
    	abstract = "This paper presents the main features of an annotation tool, the Coreference Annotator, which manages bilingual corpora consisting of aligned texts that can be grouped in collections and subcollections according to their topics and discourse. The tool allows the manual annotation of certain linguistic items in the source text and their translation equivalent in the target text, by entering useful information about these items based on their context.",
    	booktitle = "Proceedings of the Second Workshop on Annotation and Exploitation of Parallel Corpora (AEPC 2), in 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011)",
    	month = "September 15",
    	pages = "43--52",
    	title = "{C}oreference {A}nnotator - {A} new annotation tool for aligned bilingual corpora",
    	year = 2011,
    	url = "http://www.aclweb.org/anthology/W11-4307"
    }
    
  3. Georgios Petasis.
    Unsupervised Domain Adaptation based on Text Relatedness.
    In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011. 2011, 733–739.

    @inproceedings{petasis:2011:RANLP,
    	author = "Petasis, Georgios",
    	title = "Unsupervised Domain Adaptation based on Text Relatedness",
    	abstract = "In this paper an unsupervised approach to domain adaptation is presented, which exploits external knowledge sources in order to port a classification model into a new thematic domain. Our approach extracts a new feature set from documents of the target domain, and tries to align the new features to the original ones, by exploiting text relatedness from external knowledge sources, such as WordNet. The approach has been evaluated on the task of document classification, involving the classification of newsgroup postings into 20 news groups.",
    	booktitle = "Proceedings of the International Conference Recent Advances in Natural Language Processing 2011",
    	month = "September 12--14",
    	year = 2011,
    	address = "Hissar, Bulgaria",
    	publisher = "RANLP 2011 Organising Committee",
    	pages = "733--739",
    	url = "http://aclweb.org/anthology/R11-1107"
    }
    
  4. Georgios Petasis.
    Machine Learning in Natural Language Processing.
    Ph.D. Thesis, Department of Informatics and Telecommunications, University of Athens, 2011.

    @phdthesis{PhD-2011-Petasis,
    	author = "Petasis, Georgios",
    	abstract = "This thesis examines the use of machine learning techniques in various tasks of natural language processing, mainly for the task of information extraction from texts. The objectives are the improvement of adaptability of information extraction systems to new thematic domains (or even languages), and the improvement of their performance using as few resources (either linguistic or human) as possible. This thesis has examined two main axes: a) the research and assessment of existing algorithms of machine learning mainly in the stages of linguistic pre-processing (such as part of speech tagging) and named-entity recognition, and b) the creation of a new machine learning algorithm and its assessment on synthetic data, as well as in real world data from the task of relation extraction between named entities. This new algorithm belongs to the category of inductive grammar learning, and can infer context free grammars from positive examples only.",
    	keywords = "information extraction, machine learning, grammatical inference",
    	month = "July 1",
    	school = "Department of Informatics and Telecommunications, University of Athens",
    	title = "{M}achine {L}earning in {N}atural {L}anguage {P}rocessing",
    	type = "Ph.D. Thesis",
    	url = "http://www.ellogon.org/petasis/bibliography/Petasis/Ph.D.Thesis-GeorgiosPetasis.pdf",
    	year = 2011
    }
    

Year: 2010

  1. Georgios Petasis.
    TkDND: a cross-platform drag'n'drop package.
    In Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010). 2010.

    @inproceedings{Tcl-2010-TkDND,
    	author = "Petasis, Georgios",
    	abstract = "This paper is about TkDND, a Tcl/Tk extension that aims to add cross-application drag and drop support to Tk, for popular operating systems, such as Microsoft Windows, Apple OS X and GNU/Linux. Being in its second rewrite, TkDND 2.x has a stable implementation for Windows and OS X, while support for Linux and the XDND protocol is still under development.",
    	address = "Hilton Suites Chicago/Oakbrook Terrace, 10 Drury Lane, Oakbrook Terrace, Illinois, United States 60181",
    	booktitle = "Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010)",
    	month = "October 11--15",
    	title = "{T}k{DND}: a cross-platform drag'n'drop package",
    	url = "http://www.ellogon.org/petasis/bibliography/Tcl2010/TkDND.pdf",
    	year = 2010
    }
    
  2. Georgios Petasis.
    Ellogon and the challenge of threads.
    In Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010). 2010.

    @inproceedings{Tcl-2010-Ellogon,
    	author = "Petasis, Georgios",
    	abstract = "This paper is about the Ellogon language engineering platform, and the challenges faced in modernising it, in order to better exploit contemporary hardware. Ellogon is an open-source infrastructure, specialised in natural language processing. Following a data model that closely resembles TIPSTER, Ellogon can be used either as an autonomous application, offering a graphical user interface, or it can be embedded in a C/C++ application as a library. Ellogon has been implemented in C/C++ and Tcl/Tk: in fact Ellogon is a vanilla Tcl interpreter, with the Ellogon core loaded as a Tcl extension, and a set of Tcl/Tk scripts that implement the GUI. The core component of Ellogon, being a Tcl extension, heavily relies on Tcl objects to implement its data model, a decision made more than a decade ago, which poses difficulties into making Ellogon a multi-threaded application.",
    	address = "Hilton Suites Chicago/Oakbrook Terrace, 10 Drury Lane, Oakbrook Terrace, Illinois, United States 60181",
    	booktitle = "Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010)",
    	month = "October 11--15",
    	title = "{E}llogon and the challenge of threads",
    	url = "http://www.ellogon.org/petasis/bibliography/Tcl2010/EllogonAndThreads.pdf",
    	year = 2010
    }
    
  3. Georgios Petasis.
    TileQt and TileGtk: current status.
    In Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010). 2010.

    @inproceedings{Tcl-2010-TileQtTileGTK,
    	author = "Petasis, Georgios",
    	abstract = "This paper is about two Tile and Ttk themes, TileQt and TileGTK. Despite being two distinct and very different extensions, the motivation for their development was common: making Tk applications look as native as possible under the Linux operating system.",
    	address = "Hilton Suites Chicago/Oakbrook Terrace, 10 Drury Lane, Oakbrook Terrace, Illinois, United States 60181",
    	booktitle = "Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010)",
    	month = "October 11--15",
    	title = "{T}ile{Q}t and {T}ile{G}tk: current status",
    	url = "http://www.ellogon.org/petasis/bibliography/Tcl2010/TileQtAndTileGTK.pdf",
    	year = 2010
    }
    
  4. Georgios Petasis.
    TkGecko: Another Attempt for an HTML Renderer for Tk.
    In Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010). 2010.
    URL BibTeX

    @inproceedings{Tcl-2010-TkGecko,
    	author = "Petasis, Georgios",
    	abstract = "The support for displaying HTML and especially complex Web sites has always been problematic in Tk. Several efforts have been made in order to alleviate this problem, and this paper presents another (and still incomplete) one. This paper presents TkGecko, a Tcl/Tk extension written in C++, which allows Gecko (the HTML processing and rendering engine developed by the Mozilla Foundation) to be embedded as a widget in Tk. The current status of the TkGecko extension is alpha quality, while the code is publicly available under the BSD license.",
    	address = "Hilton Suites Chicago/Oakbrook Terrace, 10 Drury Lane, Oakbrook Terrace, Illinois, United States 60181",
    	booktitle = "Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010)",
    	month = "October 11--15",
    	title = "{T}k{G}ecko: {A}nother {A}ttempt for an {HTML} {R}enderer for {T}k",
    	url = "http://www.ellogon.org/petasis/bibliography/Tcl2010/TkGecko.pdf",
    	year = 2010
    }
    
  5. Georgios Petasis.
    TkRibbon: Windows Ribbons for Tk.
    In Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010). 2010.
    URL BibTeX

    @inproceedings{Tcl2010-TkRibbon,
    	author = "Petasis, Georgios",
    	abstract = "This paper is about TkRibbon, a Tcl/Tk extension that aims to introduce support for the Windows Ribbon Framework in the Tk toolkit. The Windows Ribbon is a graphical interface where a set of toolbars are placed on tabs in a notebook widget, aiming to substitute traditional menus and toolbars. This paper briefly describes Windows Ribbon framework, the TkRibbon Tk extension and presents some examples on how TkRibbon can be used by Tk applications.",
    	address = "Hilton Suites Chicago/Oakbrook Terrace, 10 Drury Lane, Oakbrook Terrace, Illinois, United States 60181",
    	booktitle = "Proceedings of the 17th Annual Tcl/Tk Conference (Tcl 2010)",
    	month = "October 11--15",
    	title = "{T}k{R}ibbon: {W}indows {R}ibbons for {T}k",
    	url = "http://www.ellogon.org/petasis/bibliography/Tcl2010/TkRibbon.pdf",
    	year = 2010
    }
    
  6. Georgios Petasis and Dimitrios Petasis.
    BlogBuster: A Tool for Extracting Corpora from the Blogosphere.
    In Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner and Daniel Tapias (eds.). Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010. 2010.
    URL BibTeX

    @inproceedings{DBLP:conf/lrec/PetasisP10,
    	author = "Petasis, Georgios and Dimitrios Petasis",
    	abstract = "This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, suitable for linguistic and language technology research and development, has attracted significant research interest recently. Several general purpose approaches for removing boilerplate have been presented in the literature; however the blogosphere poses additional requirements, such as a finer control over the extracted textual segments in order to accurately identify important elements, i.e. individual blog posts, titles, posting dates or comments. BlogBuster tries to provide such additional details along with boilerplate removal, following a rule-based approach. A small set of rules were manually constructed by observing a limited set of blogs from the Blogger and Wordpress hosting platforms. These rules operate on the DOM tree of an HTML page, as constructed by a popular browser, Mozilla Firefox. Evaluation results suggest that BlogBuster is very accurate when extracting corpora from blogs hosted in the Blogger and Wordpress, while exhibiting a reasonable precision when applied to blogs not hosted in these two popular blogging platforms.",
    	address = "Valletta, Malta",
    	booktitle = "Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010",
    	editor = "Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias",
    	isbn = "2-9517408-6-7",
    	month = "May 17--23",
    	publisher = "European Language Resources Association",
    	title = "{B}log{B}uster: {A} {T}ool for {E}xtracting {C}orpora from the {B}logosphere",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2010/LREC2010-BlogBuster-CameraReady.pdf",
    	year = 2010
    }
    

Year: 2009

  1. Georgios Petasis, Vangelis Karkaletsis, Anastasia Krithara, Georgios Paliouras and Constantine D Spyropoulos.
    Semi-automated ontology learning: the BOEMIE approach.
    In Proceedings of the First ESWC Workshop on Inductive Reasoning and Machine Learning on the Semantic Web (IRMLeS 2009), 6th European Semantic Web Conference (ESWC 2009). 2009.
    URL BibTeX

    @inproceedings{citeulike:9267249,
    	author = "Petasis, Georgios and Vangelis Karkaletsis and Anastasia Krithara and Georgios Paliouras and Constantine D. Spyropoulos",
    	abstract = "In this paper we describe a semi-automated approach for ontology learning. Exploiting an ontology-based multimodal information extraction system, the ontology learning subsystem accumulates documents that are insufficiently analysed and through clustering proposes new concepts, relations and interpretation rules to be added to the ontology.",
    	address = "Hersonissos, Crete, Greece",
    	booktitle = "Proceedings of the First ESWC Workshop on Inductive Reasoning and Machine Learning on the Semantic Web (IRMLeS 2009), 6th European Semantic Web Conference (ESWC 2009)",
    	day = 1,
    	keywords = "evolution, ontologies",
    	month = "June 1",
    	title = "{S}emi-automated ontology learning: the {BOEMIE} approach",
    	url = "http://www.ellogon.org/petasis/bibliography/ESWC2009/IRMLeS2009-ESWC2009.pdf",
    	year = 2009
    }
    
  2. Silvana Castano, Irma Sofia Espinosa Peraldi, Alfio Ferrara, Vangelis Karkaletsis, Atila Kaya, Ralf Möller, Stefano Montanelli, Georgios Petasis and Michael Wessel.
    Multimedia Interpretation for Dynamic Ontology Evolution.
    Journal of Logic and Computation 19(5):859–897, 2009.
    URL, DOI BibTeX

    @article{Castano01102009,
    	author = {Silvana Castano and Irma Sofia Espinosa Peraldi and Alfio Ferrara and Karkaletsis, Vangelis and Atila Kaya and Ralf M{\"o}ller and Stefano Montanelli and Petasis, Georgios and Michael Wessel},
    	abstract = "The recent success of distributed and dynamic infrastructures for knowledge sharing has raised the need for semiautomatic/automatic ontology evolution strategies. Ontology evolution is generally defined as the timely adaptation of an ontology to changing requirements and the consistent propagation of changes to dependent artifacts. In this article, we present an ontology evolution approach in the context of multimedia interpretation. Ontology evolution in this context relies on the results obtained through reasoning for the interpretation of multimedia resources, through population of the ontology with new individuals or through enrichment of the ontology with new concepts and new semantic relations. The article analyses the results of interpretation, population and enrichment obtained in evaluation experiments in terms of measures such as precision and recall. The evaluation reveals encouraging results.",
    	doi = "10.1093/logcom/exn049",
    	journal = "Journal of Logic and Computation",
    	number = 5,
    	pages = "859--897",
    	title = "{M}ultimedia {I}nterpretation for {D}ynamic {O}ntology {E}volution",
    	url = "http://logcom.oxfordjournals.org/content/19/5/859.abstract",
    	volume = 19,
    	year = 2009,
    	eprint = "http://logcom.oxfordjournals.org/content/19/5/859.full.pdf+html"
    }
    

Year: 2008

  1. Georgios Petasis, Pavlina Fragkou, Aris Theodorakos, Vangelis Karkaletsis and Constantine D Spyropoulos.
    Segmenting HTML pages using visual and semantic information.
    In Proceedings of the 4th Web as a Corpus Workshop (WAC-4), 6th Language Resources and Evaluation Conference (LREC 2008). 2008, 18–24.
    Proceedings: The 4th Web as Corpus: Can we do better than Google? http://www.lrec-conf.org/proceedings/lrec2008/workshops/W19_Proceedings.pdf.
    URL, DOI BibTeX

    @inproceedings{citeulike:5663452,
    	author = "Petasis, Georgios and Pavlina Fragkou and Aris Theodorakos and Vangelis Karkaletsis and Constantine D. Spyropoulos",
    	abstract = "The information explosion of the Web aggravates the problem of effective information retrieval. Even though linguistic approaches found in the literature perform linguistic annotation by creating metadata in the form of tokens, lemmas or part of speech tags, however, this process is insufficient. This is due to the fact that these linguistic metadata do not exploit the actual content of the page, leading to the need of performing semantic annotation based on a predefined semantic model. This paper proposes a new learning approach for performing automatic semantic annotation. This is the result of a two step procedure: the first step partitions a web page into blocks based on its visual layout, while the second, performs subsequent partitioning based on the examination of appearance of specific types of entities denoting the semantic category as well as the application of a number of simple heuristics. Preliminary experiments performed on a manually annotated corpus regarding athletics proved to be very promising.",
    	address = "Marrakech, Morocco",
    	booktitle = "Proceedings of the 4th Web as a Corpus Workshop (WAC-4), 6th Language Resources and Evaluation Conference (LREC 2008)",
    	doi = "10.1109/SPCA.2006.297506",
    	journal = "4th Web as Corpus Workshop (WAC-4)",
    	month = "June 1",
    	note = "Proceedings: The 4th Web as Corpus: Can we do better than Google? http://www.lrec-conf.org/proceedings/lrec2008/workshops/W19_Proceedings.pdf",
    	pages = "18--24",
    	title = "{S}egmenting {HTML} pages using visual and semantic information",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2008/LREC-2008-SemanticSegmentation-Submitted.pdf",
    	year = 2008
    }
    
  2. Pavlina Fragkou, Georgios Petasis, Aris Theodorakos, Vangelis Karkaletsis and Constantine D Spyropoulos.
    BOEMIE Ontology-Based Text Annotation Tool.
    In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008). 2008.
    URL BibTeX

    @inproceedings{DBLP:conf/lrec/FragkouPTKS08,
    	author = "Pavlina Fragkou and Petasis, Georgios and Aris Theodorakos and Karkaletsis, Vangelis and Constantine D. Spyropoulos",
    	abstract = "The huge amount of the available information in the Web creates the need of effective information extraction systems that are able to produce metadata that satisfy user’s information needs. The development of such systems, in the majority of cases, depends on the availability of an appropriately annotated corpus in order to learn extraction models. The production of such corpora can be significantly facilitated by annotation tools that are able to annotate, according to a defined ontology, not only named entities but most importantly relations between them. This paper describes the BOEMIE ontology-based annotation tool which is able to locate blocks of text that correspond to specific types of named entities, fill tables corresponding to ontology concepts with those named entities and link the filled tables based on relations defined in the domain ontology. Additionally, it can perform annotation of blocks of text that refer to the same topic. The tool has a user-friendly interface, supports automatic pre-annotation, annotation comparison as well as customization to other annotation schemata. The annotation tool has been used in a large scale annotation task involving 3000 web pages regarding athletics. It has also been used in another annotation task involving 503 web pages with medical information, in different languages.",
    	address = "Marrakech, Morocco",
    	booktitle = "Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008)",
    	month = "May 26 -- June 1",
    	publisher = "European Language Resources Association",
    	title = "{BOEMIE} {O}ntology-{B}ased {T}ext {A}nnotation {T}ool",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2008/LREC-2008-324_paper.pdf",
    	year = 2008
    }
    
  3. Georgios Petasis, Vangelis Karkaletsis, Georgios Paliouras and Constantine D Spyropoulos.
    Learning context-free grammars to extract relations from text.
    In Malik Ghallab, Constantine D Spyropoulos, Nikos Fakotakis and Nikolaos M Avouris (eds.). Proceeding of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence 178. 2008, 303–307.
    URL BibTeX

    @inproceedings{Petasis:2008:LCG:1567281.1567350,
    	author = "Petasis, Georgios and Karkaletsis, Vangelis and Georgios Paliouras and Constantine D. Spyropoulos",
    	abstract = "In this paper we propose a novel relation extraction method, based on grammatical inference. Following a semi-supervised learning approach, the text that connects named entities in an annotated corpus is used to infer a context free grammar. The grammar learning algorithm is able to infer grammars from positive examples only, controlling overgeneralisation through minimum description length. Evaluation results show that the proposed approach performs comparable to the state of the art, while exhibiting a bias towards precision, which is a sign of conservative generalisation.",
    	address = "Amsterdam, The Netherlands",
    	booktitle = "Proceeding of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence",
    	editor = "Malik Ghallab and Constantine D. Spyropoulos and Nikos Fakotakis and Nikolaos M. Avouris",
    	isbn = "978-1-58603-891-5",
    	pages = "303--307",
    	publisher = "IOS Press",
    	series = "Frontiers in Artificial Intelligence and Applications",
    	title = "{L}earning context-free grammars to extract relations from text",
    	url = "http://www.ellogon.org/petasis/bibliography/ECAI2008/ECAI2008_0371.pdf",
    	volume = 178,
    	year = 2008
    }
    

Year: 2007

  1. Silvana Castano, Sofia Espinosa, Alfio Ferrara, Vangelis Karkaletsis, Atila Kaya, Sylvia Melzer, Ralf Möller, Stefano Montanelli and Georgios Petasis.
    Ontology Dynamics with Multimedia Information: The BOEMIE Evolution Methodology.
    In Proceedings of the International ESWC Workshop on Ontology Dynamics (IWOD 2007). 2007.
    http://kmi.open.ac.uk/events/iwod/.
    URL BibTeX

    @inproceedings{IWOD07,
    	author = {Silvana Castano and Sofia Espinosa and Alfio Ferrara and Karkaletsis, Vangelis and Atila Kaya and Sylvia Melzer and Ralf M{\"o}ller and Stefano Montanelli and Petasis, Georgios},
    	abstract = "In this paper, we present the ontology evolution methodology developed in the context of the BOEMIE project. Ontology evolution in BOEMIE relies on the results obtained through reasoning for the interpretation of multimedia resources in order to evolve (enhance) the ontology, through population of the ontology with new instances, or through enrichment of the ontology with new concepts and new semantic relations.",
    	address = "Innsbruck, Austria",
    	booktitle = "Proceedings of the International ESWC Workshop on Ontology Dynamics (IWOD 2007)",
    	month = "June 7",
    	note = "http://kmi.open.ac.uk/events/iwod/",
    	title = "{O}ntology {D}ynamics with {M}ultimedia {I}nformation: {T}he {BOEMIE} {E}volution {M}ethodology",
    	url = "http://www.ellogon.org/petasis/bibliography/IWOD2007/IWOD2007-paper-07.pdf",
    	year = 2007
    }
    

Year: 2005

  1. Dimitris Spiliotopoulos, Georgios Petasis and Georgios Kouroupetroglou.
    Prosodically Enriched Text Annotation for High Quality Speech Synthesis.
    In Proceedings of the 10th International Conference on Speech and Computer (SPECOM-2005). 2005, 313–316.
    URL BibTeX

    @inproceedings{SPECOM-2005-Spiliotopoulos,
    	author = "Dimitris Spiliotopoulos and Petasis, Georgios and Georgios Kouroupetroglou",
    	abstract = "Linguistically enriched text generated from natural language modules contributes significantly on the quality of speech synthesis. For all cases where such modules are not available, such enriched input needs to be produced from plain text in order to maintain quality. This work reports on a framework of several combined language resources and procedures (word/sentence identification, syntactic analysis, prosodic feature annotation) for text annotation/processing from plain text. Using that, the implementation of an automatic XML formatted output generation module produces the prosodically enriched markup.",
    	address = "Patras, Greece",
    	booktitle = "Proceedings of the 10th International Conference on Speech and Computer (SPECOM-2005)",
    	month = "October 17--19",
    	pages = "313--316",
    	title = "{P}rosodically {E}nriched {T}ext {A}nnotation for {H}igh {Q}uality {S}peech {S}ynthesis",
    	url = "http://www.ellogon.org/petasis/bibliography/SPECOM2005/Spiliotopoulos-SPECOM-2005.pdf",
    	year = 2005
    }
    

Year: 2004

  1. Georgios Petasis, Georgios Paliouras, Constantine D Spyropoulos and Constantine Halatsis.
    Eg-GRIDS: Context-Free Grammatical Inference from Positive Examples Using Genetic Search.
    In Georgios Paliouras and Yasubumi Sakakibara (eds.). Grammatical Inference: Algorithms and Applications, Proceedings of the 7th International Colloquium on Grammatical Inference (ICGI 2004) 3264. 2004, 223–234.
    URL BibTeX

    @inproceedings{DBLP:conf/icgi/PetasisPSH04,
    	author = "Petasis, Georgios and Georgios Paliouras and Constantine D. Spyropoulos and Constantine Halatsis",
    	abstract = "In this paper we present eg-GRIDS, an algorithm for inducing context-free grammars that is able to learn from positive sample sentences. The presented algorithm, similar to its GRIDS predecessors, uses simplicity as a criterion for directing inference, and a set of operators for exploring the search space. In addition to the basic beam search strategy of GRIDS, eg-GRIDS incorporates an evolutionary grammar selection process, aiming to explore a larger part of the search space. Evaluation results are presented on artificially generated data, comparing the performance of beam search and genetic search. These results show that genetic search performs better than beam search while being significantly more efficient computationally.",
    	address = "Athens, Greece",
    	booktitle = "Grammatical Inference: Algorithms and Applications, Proceedings of the 7th International Colloquium on Grammatical Inference (ICGI 2004)",
    	editor = "Georgios Paliouras and Yasubumi Sakakibara",
    	isbn = "3-540-23410-1",
    	month = "October 11--13",
    	pages = "223--234",
    	publisher = "Springer Berlin / Heidelberg",
    	series = "Lecture Notes in Computer Science",
    	title = "{E}g-{GRIDS}: {C}ontext-{F}ree {G}rammatical {I}nference from {P}ositive {E}xamples {U}sing {G}enetic {S}earch",
    	url = "http://www.ellogon.org/petasis/bibliography/ICGI2004/e-GRIDS-ICGI-2004-Submission.pdf",
    	volume = 3264,
    	year = 2004
    }
    
  2. Georgios Petasis, Vangelis Karkaletsis, Claire Grover, Ben Hachey, Maria Teresa Pazienza, Michele Vindigni and José Coch.
    Adaptive, Multilingual Named Entity Recognition in Web Pages.
    In Ramon López de Mántaras and Lorenza Saitta (eds.). Proceedings of the 16th European Conference on Artificial Intelligence (ECAI'2004), including Prestigious Applications of Intelligent Systems (PAIS 2004). 2004, 1073–1074.
    Extended version: http://www.ellogon.org/petasis/bibliography/ECAI2004/ECAI2004_NERC.pdf.
    URL BibTeX

    @inproceedings{DBLP:conf/ecai/PetasisKGHPVC04,
    	author = "Petasis, Georgios and Karkaletsis, Vangelis and Claire Grover and Ben Hachey and Maria Teresa Pazienza and Michele Vindigni and Jos{\'e} Coch",
    	abstract = "Most of the information on the Web today is in the form of HTML documents, which are designed for presentation purposes and not for machine understanding and reasoning. Existing web extraction systems require a lot of human involvement for maintenance due to changes to targeted web sites and for adaptation to new web sites or even to new domains. This paper presents the adaptive, multilingual named entity recognition and classification (NERC) technologies developed for processing web pages in the context of the R{\&}D project CROSSMARC. The evaluation results demonstrate the viability of our approach.",
    	address = "Valencia, Spain",
    	booktitle = "Proceedings of the 16th European Conference on Artificial Intelligence (ECAI'2004), including Prestigious Applications of Intelligent Systems (PAIS 2004)",
    	crossref = "DBLP:conf/ecai/2004",
    	editor = "Ramon L{\'o}pez de M{\'a}ntaras and Lorenza Saitta",
    	isbn = "1-58603-452-9",
    	month = "August 22--27",
    	note = "Extended version: http://www.ellogon.org/petasis/bibliography/ECAI2004/ECAI2004_NERC.pdf",
    	pages = "1073--1074",
    	publisher = "IOS Press",
    	title = "{A}daptive, {M}ultilingual {N}amed {E}ntity {R}ecognition in {W}eb {P}ages",
    	url = "http://www.ellogon.org/petasis/bibliography/ECAI2004/Petasis-ECAI2004-Poster.pdf",
    	year = 2004
    }
    
  3. Stavros J Perantonis, Basilios Gatos, Vassilios Maragos, Vangelis Karkaletsis and Georgios Petasis.
    Text Area Identification in Web Images.
    In George A Vouros and Themis Panayiotopoulos (eds.). Methods and Applications of Artificial Intelligence, Proceedings of the 3rd Hellenic Conference on Artificial Intelligence (SETN 2004) 3025. May 2004, 82–92.
    URL BibTeX

    @inproceedings{DBLP:conf/setn/PerantonisGMKP04,
    	author = "Stavros J. Perantonis and Basilios Gatos and Vassilios Maragos and Karkaletsis, Vangelis and Petasis, Georgios",
    	abstract = "With the explosive growth of the World Wide Web, millions of documents are published and accessed on-line. Statistics show that a significant part of Web text information is encoded in Web images. Since Web images have special characteristics that sometimes distinguish them from other types of images, commercial OCR products often fail to recognize Web images due to their special characteristics. This paper proposes a novel Web image processing algorithm that aims to locate text areas and prepare them for OCR procedure with better results. Our methodology for text area identification has been fully integrated with an OCR engine and with an Information Extraction system. We present quantitative results for the performance of the OCR engine as well as qualitative results concerning its effects to the Information Extraction system. Experimental results obtained from a large corpus of Web images, demonstrate the efficiency of our methodology.",
    	address = "Samos, Greece",
    	booktitle = "Methods and Applications of Artificial Intelligence, Proceedings of the 3rd Hellenic Conference on Artificial Intelligence (SETN 2004)",
    	editor = "George A. Vouros and Themis Panayiotopoulos",
    	isbn = "3-540-21937-4",
    	month = "May",
    	pages = "82--92",
    	publisher = "Springer Berlin / Heidelberg",
    	series = "Lecture Notes in Computer Science",
    	title = "{T}ext {A}rea {I}dentification in {W}eb {I}mages",
    	url = "http://www.ellogon.org/petasis/bibliography/SETN2004/SETN2004.pdf",
    	volume = 3025,
    	year = 2004
    }
    
  4. Georgios Petasis, Georgios Paliouras, Vangelis Karkaletsis, Constantine Halatsis and Constantine D Spyropoulos.
    E-GRIDS: Computationally Efficient Grammatical Inference from Positive Examples.
    GRAMMARS 7:69–110, 2004.
    Technical Report referenced in the paper: http://www.ellogon.org/petasis/bibliography/GRAMMARS/GRAMMARS2004-SpecialIssue-Petasis-TechnicalReport.pdf.
    URL BibTeX

    @article{GRAMMARS-vol.7-Petasis,
    	author = "Petasis, Georgios and Georgios Paliouras and Vangelis Karkaletsis and Constantine Halatsis and Constantine D. Spyropoulos",
    	abstract = "In this paper we present a new computationally efficient algorithm for inducing context-free grammars that is able to learn from positive sample sentences. This new algorithm uses simplicity as a criterion for directing inference, and the search process of the new algorithm has been optimised by utilising the results of a theoretical analysis regarding the behaviour and complexity of the search operators. Evaluation results are presented on artificially generated data, while the scalability of the algorithm is tested on a large textual corpus. These results show that the new algorithm performs well and can infer grammars from large data sets in a reasonable amount of time.",
    	journal = "GRAMMARS",
    	keywords = "grammatical inference, context-free grammars, minimum description length, positive examples",
    	note = "Technical Report referenced in the paper: http://www.ellogon.org/petasis/bibliography/GRAMMARS/GRAMMARS2004-SpecialIssue-Petasis-TechnicalReport.pdf",
    	pages = "69--110",
    	title = "{E}-{GRIDS}: {C}omputationally {E}fficient {G}rammatical {I}nference from {P}ositive {E}xamples",
    	url = "http://www.ellogon.org/petasis/bibliography/GRAMMARS/GRAMMARS2004.pdf",
    	volume = 7,
    	year = 2004
    }
    

Year: 2003

  1. Georgios Petasis, Vangelis Karkaletsis, Georgios Paliouras and Constantine D Spyropoulos.
    Using the Ellogon Natural Language Engineering Infrastructure.
    In Proceedings of the Workshop on Balkan Language Resources and Tools, 1st Balkan Conference in Informatics (BCI 2003). 2003.
    http://labs-repos.iit.demokritos.gr/skel/bci03_workshop/.
    URL BibTeX

    @inproceedings{BCI2003-Petasis,
    	author = "Petasis, Georgios and Vangelis Karkaletsis and Georgios Paliouras and Constantine D. Spyropoulos",
    	abstract = "Ellogon is a multi-lingual, cross-operating system, general-purpose natural language engineering infrastructure. Ellogon has been used extensively in various NLP applications. It is currently provided for free for research use to research and academic organisations. In this paper, we outline its architecture and data model, present Ellogon features as used by different types of users and discuss its functionalities against other infrastructures for language engineering.",
    	address = "Thessaloniki, Greece",
    	booktitle = "Proceedings of the Workshop on Balkan Language Resources and Tools, 1st Balkan Conference in Informatics (BCI 2003)",
    	month = "November 21",
    	note = "http://labs-repos.iit.demokritos.gr/skel/bci03_workshop/",
    	title = "{U}sing the {E}llogon {N}atural {L}anguage {E}ngineering {I}nfrastructure",
    	url = "http://www.ellogon.org/petasis/bibliography/BCI2003/BCI2003-Petasis.pdf",
    	year = 2003
    }
    
  2. Georgios Petasis, Vangelis Karkaletsis and Constantine D Spyropoulos.
    Cross-lingual Information Extraction from Web pages: the use of a general-purpose Text Engineering Platform.
    In Proceedings of the 4th International Conference on Recent Advances in Natural Language Processing (RANLP 2003). 2003, 381–388.
    http://lml.bas.bg/ranlp2003/.
    URL BibTeX

    @inproceedings{RANLP2003-Petasis,
    	author = "Petasis, Georgios and Vangelis Karkaletsis and Constantine D. Spyropoulos",
    	abstract = {In this paper we present how the use of a general-purpose text engineering platform has facilitated the development of a cross-lingual information extraction system and its adaptation to new domains and languages. Our approach for crosslingual information extraction from the Web covers all the way from the identification of Web sites of interest, to the location of the domain specific Web pages, to the extraction of specific information from the Web pages and its presentation to the end-user. This approach has been implemented in the context of the IST project CROSSMARC. The text engineering platform {"}Ellogon{"} offers functionalities that facilitated the development of core CROSSMARC components as well as their porting into new domains and languages.},
    	address = "Borovets, Bulgaria",
    	booktitle = "Proceedings of the 4th International Conference on Recent Advances in Natural Language Processing (RANLP 2003)",
    	month = "September 10--12",
    	note = "http://lml.bas.bg/ranlp2003/",
    	pages = "381--388",
    	title = "{C}ross-lingual {I}nformation {E}xtraction from {W}eb pages: the use of a general-purpose {T}ext {E}ngineering {P}latform",
    	url = "http://www.ellogon.org/petasis/bibliography/RANLP2003/RANLP-CameraReady.pdf",
    	year = 2003
    }
    

Year: 2002

  1. Dimitra Farmakiotou, Vangelis Karkaletsis, Ioannis Koutsias, Georgios Petasis and Constantine D Spyropoulos.
    PatEdit: An Information Extraction Pattern Editor for Fast System Customization.
    In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002). 2002, 1097–1102.
    URL BibTeX

    @inproceedings{FarmakiotouEtAl02,
    	author = "Dimitra Farmakiotou and Karkaletsis, Vangelis and Ioannis Koutsias and Petasis, Georgios and Constantine D. Spyropoulos",
    	abstract = "This paper addresses the problem of Information Extraction (IE) system customization to new domains and extraction needs with the use of PatEdit, an IE Pattern Editor. PatEdit is a human-assisted knowledge engineering tool, that facilitates the production of IE patterns. First, we present the problem of IE system customisation and the use of human assisted knowledge engineering tools. Then, we describe PatEdit with respect to the IE pattern language used and discuss its characteristics that facilitate rapid pattern writing. Finally, the exploitation of PatEdit in two information extraction projects is presented along with our plans for future work.",
    	address = "Las Palmas, Canary Islands, Spain",
    	booktitle = "Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002)",
    	month = "May 29--31",
    	pages = "1097--1102",
    	publisher = "European Language Resources Association",
    	title = "{P}at{E}dit: {A}n {I}nformation {E}xtraction {P}attern {E}ditor for {F}ast {S}ystem {C}ustomization",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2002/LREC2002_Farmakiotou.pdf",
    	year = 2002
    }
    
  2. Claire Grover, Scott McDonald, Donnla Nic Gearailt, Vangelis Karkaletsis, Dimitra Farmakiotou, Georgios Samaritakis, Georgios Petasis, Maria Teresa Pazienza, Michele Vindigni and Frantz Vichot.
    Multilingual XML-Based Named Entity Recognition for E-Retail Domains.
    In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002). 2002.
    URL BibTeX

    @inproceedings{LREC2002-Grover,
    	author = "Claire Grover and Scott McDonald and Donnla Nic Gearailt and Vangelis Karkaletsis and Dimitra Farmakiotou and Georgios Samaritakis and Petasis, Georgios and Maria Teresa Pazienza and Michele Vindigni and Frantz Vichot",
    	abstract = "We describe the multilingual Named Entity Recognition and Classification (NERC) subpart of an e-retail product comparison system which is currently under development as part of the EU-funded project CROSSMARC. The system must be rapidly extensible, both to new languages and new domains. To achieve this aim we use XML as our common exchange format and the monolingual NERC components use a combination of rule-based and machine-learning techniques. It has been challenging to process web pages which contain heavily structured data where text is intermingled with HTML and other code. Our preliminary evaluation results demonstrate the viability of our approach.",
    	address = "Las Palmas, Canary Islands, Spain",
    	booktitle = "Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002)",
    	month = "May 29--31",
    	publisher = "European Language Resources Association",
    	title = "{M}ultilingual {XML}-{B}ased {N}amed {E}ntity {R}ecognition for {E}-{R}etail {D}omains",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2002/LREC2002_Grover.pdf",
    	year = 2002
    }
    
  3. Georgios Petasis, Vangelis Karkaletsis, Georgios Paliouras, Ion Androutsopoulos and Constantine D Spyropoulos.
    Ellogon: A New Text Engineering Platform.
    In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002). 2002, 72–78.
    URL BibTeX

    @inproceedings{Petasis02ellogon:a,
    	author = "Petasis, Georgios and Karkaletsis, Vangelis and Georgios Paliouras and Ion Androutsopoulos and Constantine D. Spyropoulos",
    	abstract = "This paper presents Ellogon, a multi-lingual, cross-platform, general-purpose text engineering environment. Ellogon was designed in order to aid both researchers in natural language processing, as well as companies that produce language engineering systems for the end-user. Ellogon provides a powerful TIPSTER-based infrastructure for managing, storing and exchanging textual data, embedding and managing text processing components as well as visualising textual data and their associated linguistic information. Among its key features are full Unicode support, an extensive multi-lingual graphical user interface, its modular architecture and the reduced hardware requirements.",
    	address = "Las Palmas, Canary Islands, Spain",
    	booktitle = "Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002)",
    	month = "May 29--31",
    	pages = "72--78",
    	publisher = "European Language Resources Association",
    	title = "{E}llogon: {A} {N}ew {T}ext {E}ngineering {P}latform",
    	url = "http://www.ellogon.org/petasis/bibliography/LREC2002/LREC2002_Petasis.pdf",
    	year = 2002
    }
    
  4. Dimitra Farmakiotou, Vangelis Karkaletsis, Georgios Samaritakis, Georgios Petasis and Constantine D Spyropoulos.
    Named Entity Recognition from Greek Web Pages.
    In Ioannis P Vlahavas and Constantine D Spyropoulos (eds.). Proceedings of the 2nd Hellenic Conference on Artificial Intelligence (SETN-02), Companion Volume. 2002, 91–102.
    http://lpis.csd.auth.gr/setn02/.
    URL BibTeX

    @inproceedings{DBLP:conf/setn/AndroutsopoulosSSDKS02,
    	author = "Dimitra Farmakiotou and Karkaletsis, Vangelis and Georgios Samaritakis and Petasis, Georgios and Constantine D. Spyropoulos",
    	abstract = "We describe the functionalities of the Hellenic Named Entity Recognition and Classification (HNERC) system developed in the context of the CROSSMARC project. CROSSMARC is developing technology for e-retail product comparison. The CROSSMARC system locates relevant retailers’ web pages and processes them in order to extract information about their products (e.g. technical features, prices). CROSSMARC’s technology is demonstrated and evaluated for two different product types and four languages (English, Greek, Italian, French). This paper presents the HNERC system that is responsible for the identification and classification of specific types of proper names (e.g. laptop manufacturers, models), numerical expressions (e.g. length, weight), and temporal expressions (e.g. time, date) in Hellenic vendor sites. The paper presents the HNERC processing stages using examples from the laptops domain.",
    	address = "Thessaloniki, Greece",
    	booktitle = "Proceedings of the 2nd Hellenic Conference on Artificial Intelligence (SETN-02), Companion Volume",
    	editor = "Ioannis P. Vlahavas and Constantine D. Spyropoulos",
    	month = "April 11--12",
    	note = "http://lpis.csd.auth.gr/setn02/",
    	pages = "91--102",
    	title = "{N}amed {E}ntity {R}ecognition from {G}reek {W}eb {P}ages",
    	url = "http://www.ellogon.org/petasis/bibliography/SETN2002/091.pdf",
    	year = 2002
    }
    
  5. Georgios Petasis, Sergios Petridis, Georgios Paliouras, Vangelis Karkaletsis, Stavros J Perantonis and Constantine D Spyropoulos.
    Symbolic and Neural Learning of Named-Entity Recognition and Classification Systems in Two Languages.
    In Hans-Jürgen Zimmermann, Georgios Tselentis, Maarten van Someren and Georgios Dounias (eds.). Advances in Computational Intelligence and Learning: Methods and Applications. Series International Series in Intelligent Technologies, volume 18, Springer Berlin / Heidelberg, January 2002, pages 193–210.
    http://www.springer.com/mathematics/book/978-0-7923-7645-3.
    URL BibTeX

    @incollection{Petasis:2002:SNL:647292.722672,
    	author = "Petasis, Georgios and Sergios Petridis and Georgios Paliouras and Karkaletsis, Vangelis and Stavros J. Perantonis and Constantine D. Spyropoulos",
    	abstract = "This paper compares two alternative approaches to the problem of acquiring named-entity recognition and classification systems from training corpora, in two different languages. The process of named-entity recognition and classification is an important subtask in most language engineering applications, in particular information extraction, where different types of named entity are associated with specific roles in events. The manual construction of rules for the recognition of named entities is a tedious and time-consuming task. For this reason, effective methods to acquire such systems automatically from data are very desirable. In this paper we compare two popular learning methods on this task: a decision-tree induction method and a multi-layered feed-forward neural network. Particular emphasis is paid on the selection of the appropriate data representation for each method and the extraction of training examples from unstructured textual data. We compare the performance of the two methods on large corpora of English and Greek texts and present the results. In addition to the good performance of both methods, one very interesting result is the fact that a simple representation of the data, which ignores the order of the words within a named entity, leads to improved results over a more complex approach that preserves word order.",
    	booktitle = "Advances in Computational Intelligence and Learning: Methods and Applications",
    	editor = {Hans-J{\"u}rgen Zimmermann and Georgios Tselentis and Maarten van Someren and Georgios Dounias},
    	isbn = "978-0-7923-7645-3",
    	keywords = "named entity recognition, tree induction, neural networks",
    	month = "January",
    	note = "http://www.springer.com/mathematics/book/978-0-7923-7645-3",
    	pages = "193--210",
    	publisher = "Springer Berlin / Heidelberg",
    	series = "International Series in Intelligent Technologies",
    	title = "{S}ymbolic and {N}eural {L}earning of {N}amed-{E}ntity {R}ecognition and {C}lassification {S}ystems in {T}wo {L}anguages",
    	url = "http://www.ellogon.org/petasis/bibliography/COIL2000/COILBook2001.pdf",
    	volume = 18,
    	year = 2002
    }
    

Year: 2001

  1. Georgios Petasis, Vangelis Karkaletsis, Dimitra Farmakiotou, Ion Androutsopoulos and Constantine D Spyropoulos.
    A Greek Morphological Lexicon and its Exploitation by a Greek Controlled Language Checker.
    In Proceedings of the 8th Panhellenic Conference on Informatics (PCI'01). 2001, 80–89.
    URL BibTeX

    @inproceedings{Petasis:2001:GML:1756269.1756295,
    	author = "Petasis, Georgios and Karkaletsis, Vangelis and Dimitra Farmakiotou and Ion Androutsopoulos and Constantine D. Spyropoulos",
    	abstract = {This paper presents a large-scale Greek morphological lexicon, developed by the Software {\&} Knowledge Engineering Laboratory (SKEL) of NCSR {"}Demokritos{"}. The paper describes the lexicon architecture and the procedure to develop and update it. The morphological lexicon was used to develop a lemmatiser and a morphological analyser that were included in a controlled language checker for Greek. The paper discusses the current coverage of the lexicon, as well as remaining issues and how we plan to address them. Our goal is to produce a wide-coverage morphological lexicon of Greek that can be easily exploited in several natural language processing applications.},
    	booktitle = "Proceedings of the 8th Panhellenic Conference on Informatics (PCI'01)",
    	month = "November 8--10",
    	pages = "80--89",
    	series = "PCI'01",
    	title = "{A} {G}reek {M}orphological {L}exicon and its {E}xploitation by a {G}reek {C}ontrolled {L}anguage {C}hecker",
    	url = "http://www.ellogon.org/petasis/bibliography/PCI2001/EPY-Morph-CameraReady.pdf",
    	year = 2001
    }
    
  2. Georgios Petasis, Frantz Vichot, Francis Wolinski, Georgios Paliouras, Vangelis Karkaletsis and Constantine D Spyropoulos.
    Using Machine Learning to Maintain Rule-based Named-Entity Recognition and Classification Systems.
    In Proceedings of the 39th Annual Meeting on Association for Computational Linguistics. 2001, 426–433.
    URL, DOI BibTeX

    @inproceedings{Petasis:2001:UML:1073012.1073067,
    	author = "Petasis, Georgios and Frantz Vichot and Francis Wolinski and Georgios Paliouras and Karkaletsis, Vangelis and Constantine D. Spyropoulos",
    	abstract = "This paper presents a method that assists in maintaining a rule-based named-entity recognition and classification system. The underlying idea is to use a separate system, constructed with the use of machine learning, to monitor the performance of the rule-based system. The training data for the second system is generated with the use of the rule-based system, thus avoiding the need for manual tagging. The disagreement of the two systems acts as a signal for updating the rule-based system. The generality of the approach is illustrated by applying it to large corpora in two different languages: Greek and French. The results are very encouraging, showing that this alternative use of machine learning can assist significantly in the maintenance of rule-based systems.",
    	address = "Toulouse, France",
    	booktitle = "Proceedings of the 39th Annual Meeting on Association for Computational Linguistics",
    	doi = "10.3115/1073012.1073067",
    	month = "July 9--11",
    	pages = "426--433",
    	publisher = "Association for Computational Linguistics",
    	series = "ACL '01",
    	title = "{U}sing {M}achine {L}earning to {M}aintain {R}ule-based {N}amed-{E}ntity {R}ecognition and {C}lassification {S}ystems",
    	url = "http://www.ellogon.org/petasis/bibliography/ACL2001/ACL-2001-CameraReady.pdf",
    	year = 2001
    }
    
  3. Vangelis Karkaletsis, Georgios Samaritakis, Georgios Petasis, Dimitra Farmakiotou, Ion Androutsopoulos and Constantine D Spyropoulos.
    A Controlled Language Checker Based on the Ellogon Text Engineering Platform.
    In Proceedings from Language Technologies 2001: The Second Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2001). 2001, 74–75.
    URL BibTeX

    @inproceedings{NAACL2001-Karkaletsis,
    	author = "Vangelis Karkaletsis and Georgios Samaritakis and Petasis, Georgios and Dimitra Farmakiotou and Ion Androutsopoulos and Constantine D. Spyropoulos",
    	address = "Pittsburgh, PA, USA",
    	booktitle = "Proceedings from Language Technologies 2001: The Second Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2001)",
    	month = "June 2--7",
    	organization = "Carnegie Mellon University",
    	pages = "74--75",
    	title = "{A} {C}ontrolled {L}anguage {C}hecker {B}ased on the {E}llogon {T}ext {E}ngineering {P}latform",
    	year = 2001,
    	url = "http://www.ellogon.org/petasis/bibliography/NAACL2001/NAACL01-demo-ABSTRACT.pdf",
    	keywords = "controlled languages, Modern Greek"
    }
    

Year: 2000

  1. Georgios Paliouras, Vangelis Karkaletsis, Georgios Petasis and Constantine D Spyropoulos.
    Learning Decision Trees for Named-Entity Recognition and Classification.
    In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI 2000). 2000.
    URL BibTeX

    @inproceedings{ECAI2000-Petasis,
    	author = "Georgios Paliouras and Karkaletsis, Vangelis and Petasis, Georgios and Constantine D. Spyropoulos",
    	abstract = {We propose the use of decision tree induction as a solution to the problem of customising a named-entity recognition and classification (NERC) system to a specific domain. A NERC system assigns semantic tags to phrases that correspond to named entities, e.g. persons, locations and organisations. Typically, such a system makes use of two language resources: a recognition grammar and a lexicon of known names, classified by the corresponding named-entity types. NERC systems have been shown to achieve good results when the domain of application is very specific. However, the construction of the grammar and the lexicon for a new domain is a hard and time-consuming process. We propose the use of decision trees as NERC {"}grammars{"} and the construction of these trees using machine learning. In order to validate our approach, we tested C4.5 on the identification of person and organisation names involved in management succession events, using data from the sixth Message Understanding Conference. The results of the evaluation are very encouraging showing that the induced tree can outperform a grammar that was constructed manually.},
    	booktitle = "Proceedings of the 14th European Conference on Artificial Intelligence (ECAI 2000)",
    	month = "August 20--25",
    	series = "ECAI 2000",
    	title = "{L}earning {D}ecision {T}rees for {N}amed-{E}ntity {R}ecognition and {C}lassification",
    	url = "http://www.ellogon.org/petasis/bibliography/ECAI2000/ECAI-2000.pdf",
    	year = 2000
    }
    
  2. Georgios Petasis.
    Machine Learning and Named-Entity Recognition.
    In Proceedings of the 8th ELSNET European Summer School on Language and Speech Communication on the subject of Text and Speech Triggered Information Access (TeSTIA 2000). 2000.
    BibTeX

    @inproceedings{TESTIA2000-Petasis,
    	author = "Petasis, Georgios",
    	address = "Chios, Greece",
    	booktitle = "Proceedings of the 8th ELSNET European Summer School on Language and Speech Communication on the subject of Text and Speech Triggered Information Access (TeSTIA 2000)",
    	month = "July 15--30",
    	title = "{M}achine {L}earning and {N}amed-{E}ntity {R}ecognition",
    	year = 2000
    }
    
  3. Georgios Petasis, Alessandro Cucchiarelli, Paola Velardi, Georgios Paliouras, Vangelis Karkaletsis and Constantine D Spyropoulos.
    Automatic adaptation of proper noun dictionaries through cooperation of machine learning and probabilistic methods.
    In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 2000, 128–135.
    URL, DOI BibTeX

    @inproceedings{Petasis:2000:AAP:345508.345563,
    	author = "Petasis, Georgios and Alessandro Cucchiarelli and Paola Velardi and Georgios Paliouras and Karkaletsis, Vangelis and Constantine D. Spyropoulos",
    	abstract = "The recognition of Proper Nouns (PNs) is considered an important task in the area of Information Retrieval and Extraction. However the high performance of most existing PN classifiers heavily depends upon the availability of large dictionaries of domain-specific Proper Nouns, and a certain amount of manual work for rule writing or manual tagging. Though it is not a heavy requirement to rely on some existing PN dictionary (often these resources are available on the web), its coverage of a domain corpus may be rather low, in absence of manual updating. In this paper we propose a technique for the automatic updating of a PN Dictionary through the cooperation of an inductive and a probabilistic classifier. In our experiments we show that, whenever an existing PN Dictionary allows the identification of 50{\%} of the proper nouns within a corpus, our technique allows, without additional manual effort, the successful recognition of about 90{\%} of the remaining 50{\%}.",
    	address = "New York, NY, USA",
    	booktitle = "Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)",
    	doi = "10.1145/345508.345563",
    	isbn = "1-58113-226-3",
    	keywords = "information extraction, machine learning and IR, natural language processing for IR, text data mining",
    	month = "July 24--28",
    	pages = "128--135",
    	publisher = "ACM",
    	series = "SIGIR '00",
    	title = "{A}utomatic adaptation of proper noun dictionaries through cooperation of machine learning and probabilistic methods",
    	url = "http://www.ellogon.org/petasis/bibliography/SIGIR2000/SIGIR-CameraReady.pdf",
    	year = 2000
    }
    
  4. Georgios Petasis, Sergios Petridis, Georgios Paliouras, Vangelis Karkaletsis, Stavros J Perantonis and Constantine D Spyropoulos.
    Symbolic and Neural Learning for Named-Entity Recognition.
    In Proceedings of European Best Practice Workshops and Symposium on Computational Intelligence and Learning (COIL 2000). 2000, 58–66.
    URL BibTeX

    @inproceedings{Petasis00c.:symbolic,
    	author = "Petasis, Georgios and Sergios Petridis and Georgios Paliouras and Karkaletsis, Vangelis and Stavros J. Perantonis and Constantine D. Spyropoulos",
    	abstract = "Named-entity recognition involves the identification and classification of named entities in text. This is an important subtask in most language engineering applications, in particular information extraction, where different types of named entity are associated with specific roles in events. The manual construction of rules for the recognition of named entities is a tedious and time-consuming task. For this reason, we present in this paper two approaches to learning named-entity recognition rules from text. The first approach is a decision-tree induction method and the second a multi-layered feed-forward neural network. Particular emphasis is paid on the selection of the appropriate feature set for each method and the extraction of training examples from unstructured textual data. We compare the performance of the two methods on a large corpus of English text and present the results.",
    	address = "Chios, Greece",
    	booktitle = "Proceedings of European Best Practice Workshops and Symposium on Computational Intelligence and Learning (COIL 2000)",
    	keywords = "named entity recognition, tree induction, neural networks",
    	month = "June 19--23",
    	pages = "58--66",
    	title = "{S}ymbolic and {N}eural {L}earning for {N}amed-{E}ntity {R}ecognition",
    	url = "http://www.ellogon.org/petasis/bibliography/COIL2000/COIL-2000.pdf",
    	year = 2000
    }
    
  5. Georgios Petasis, Georgios Paliouras, Vangelis Karkaletsis, Constantine D Spyropoulos and Ion Androutsopoulos.
    Using Machine Learning Techniques for Part-Of-Speech Tagging in the Greek Language.
    In Dimitrios I Fotiadis and Stavros D Nikolopoulos (eds.). ADVANCES IN INFORMATICS: Proceedings of the 7th Hellenic Conference on Informatics (HCI '99). World Scientific, May 2000, pages 273–281.
    http://www.worldscibooks.com/compsci/4320.html.
    URL BibTeX

    @incollection{HCI1999-Petasis,
    	author = "Petasis, Georgios and Georgios Paliouras and Karkaletsis, Vangelis and Constantine D. Spyropoulos and Ion Androutsopoulos",
    	abstract = {This article investigates the use of Transformation-Based Error-Driven learning for resolving part-of-speech ambiguity in the Greek language. The aim is not only to study the performance, but also to examine its dependence on different thematic domains. Results are presented here for two different test cases: a corpus on {"}management succession events{"} and a general-theme corpus. The two experiments show that the performance of this method does not depend on the thematic domain of the corpus, and its accuracy for the Greek language is around 95{\%}.},
    	booktitle = "ADVANCES IN INFORMATICS: Proceedings of the 7th Hellenic Conference on Informatics (HCI '99)",
    	editor = "Dimitrios I. Fotiadis and Stavros D. Nikolopoulos",
    	isbn = "978-981-02-4192-6",
    	month = "May",
    	note = "http://www.worldscibooks.com/compsci/4320.html",
    	pages = "273--281",
    	publisher = "World Scientific",
    	title = "{U}sing {M}achine {L}earning {T}echniques for {P}art-{O}f-{S}peech {T}agging in the {G}reek {L}anguage",
    	url = "http://www.ellogon.org/petasis/bibliography/HCI1999/EPY99.pdf",
    	year = 2000
    }
    

Year: 1999

  1. Vangelis Karkaletsis, Georgios Paliouras, Georgios Petasis, Natasa Manousopoulou and Constantine D Spyropoulos.
    Named-Entity Recognition from Greek and English Texts.
    Journal of Intelligent and Robotic Systems 26(2):123–135, October 1999.
    URL, DOI BibTeX

    @article{Karkaletsis:1999:NRG:595358.595565,
    	author = "Karkaletsis, Vangelis and Georgios Paliouras and Petasis, Georgios and Natasa Manousopoulou and Constantine D. Spyropoulos",
    	abstract = "Named-entity recognition (NER) involves the identification and classification of named entities in text. This is an important subtask in most language engineering applications, in particular information extraction, where different types of named entity are associated with specific roles in events. In this paper, we present a prototype NER system for Greek texts that we developed based on a NER system for English. Both systems are evaluated on corpora of the same domain and of similar size. The time-consuming process for the construction and update of domain-specific resources in both systems led us to examine a machine learning method for the automatic construction of such resources for a particular application in a specific language.",
    	address = "Hingham, MA, USA",
    	doi = "10.1023/A:1008124406923",
    	issn = "0921-0296",
    	journal = "Journal of Intelligent and Robotic Systems",
    	keywords = "information extraction, machine learning, named-entity recognition",
    	month = "October",
    	number = 2,
    	pages = "123--135",
    	publisher = "Kluwer Academic Publishers",
    	title = "{N}amed-{E}ntity {R}ecognition from {G}reek and {E}nglish {T}exts",
    	url = "http://www.ellogon.org/petasis/bibliography/JIRS1999/JIRS-1999.pdf",
    	volume = 26,
    	year = 1999
    }
    
  2. Georgios Petasis.
    Exploiting Learning in Bilingual Named Entity Recognition.
    In Proceedings of the ECCAI Advanced Course on Artificial Intelligence (ACAI '99). 1999.
    URL BibTeX

    @inproceedings{ACAI1999-Petasis2,
    	author = "Petasis, Georgios",
    	address = "Chania, Greece",
    	booktitle = "Proceedings of the ECCAI Advanced Course on Artificial Intelligence (ACAI '99)",
    	month = "July 5--16",
    	title = "{E}xploiting {L}earning in {B}ilingual {N}amed {E}ntity {R}ecognition",
    	year = 1999,
    	url = "http://www.ellogon.org/petasis/bibliography/ACAI1999/ss1_07.pdf"
    }
    
  3. Georgios Petasis, Georgios Paliouras, Vangelis Karkaletsis and Constantine D Spyropoulos.
    Resolving Part-of-Speech Ambiguity in the Greek Language Using Learning Techniques.
    In Proceedings of the ECCAI Advanced Course on Artificial Intelligence (ACAI '99). 1999.
    URL BibTeX

    @inproceedings{Petasis99resolvingpart-of-speech,
    	author = "Petasis, Georgios and Georgios Paliouras and Karkaletsis, Vangelis and Constantine D. Spyropoulos",
    	abstract = {This article investigates the use of Transformation-Based Error-Driven learning for resolving part-of-speech ambiguity in the Greek language. The aim is not only to study the performance, but also to examine its dependence on different thematic domains. Results are presented here for two different test cases: a corpus on {"}management succession events{"} and a general-theme corpus. The two experiments show that the performance of this method does not depend on the thematic domain of the corpus, and its accuracy for the Greek language is around 95{\%}.},
    	address = "Chania, Greece",
    	booktitle = "Proceedings of the ECCAI Advanced Course on Artificial Intelligence (ACAI '99)",
    	month = "July 5--16",
    	title = "{R}esolving {P}art-of-{S}peech {A}mbiguity in the {G}reek {L}anguage {U}sing {L}earning {T}echniques",
    	url = "http://www.ellogon.org/petasis/bibliography/ACAI1999/9906019.pdf",
    	year = 1999
    }
    
  4. Vangelis Karkaletsis, Constantine D Spyropoulos and Georgios Petasis.
    Named Entity Recognition from Greek texts: the GIE Project.
    In Spyros G Tzafestas (ed.). Advances in Intelligent Systems: Concepts, Tools and Applications. Series Intelligent Systems, Control and Automation: Science and Engineering, volume 21, Springer Berlin / Heidelberg, 1999, pages 131–142.
    Presented at the 3rd European Robotics Intelligent Systems & Control Conference (EURISCON '98), June 22–25 1998, Athens, Greece.
    URL BibTeX

    @incollection{EURISCON1998-Karkaletsis,
    	author = "Vangelis Karkaletsis and Constantine D. Spyropoulos and Petasis, Georgios",
    	booktitle = "Advances in Intelligent Systems: Concepts, Tools and Applications",
    	chapter = 12,
    	editor = "Spyros G. Tzafestas",
    	isbn = "978-1-4020-0393-6",
    	keywords = "named-entity recognition, information extraction, machine learning",
    	note = "Presented at the 3rd European Robotics Intelligent Systems {\&} Control Conference (EURISCON '98), June 22--25 1998, Athens, Greece.",
    	pages = "131--142",
    	publisher = "Springer Berlin / Heidelberg",
    	series = "Intelligent Systems, Control and Automation: Science and Engineering",
    	title = "{N}amed {E}ntity {R}ecognition from {G}reek texts: the {GIE} {P}roject",
    	url = "http://www.springer.com/computer/image+processing/book/978-1-4020-0393-6",
    	volume = 21,
    	year = 1999
    }
    
