ECTS credits: 5
ECTS Hours Rules/Memories
Hours of tutorials: 5
Expository class: 15
Interactive classroom: 20
Total: 40
Languages used: German, English
Type: Ordinary subject, Master’s Degree RD 1393/2007 - 822/2021
Departments: External department linked to the degrees
Areas: Área externa M.U Erasmus Mundus en Lexicografía (2ªed)
Center: Faculty of Philology
Call: Second Semester
Teaching: With teaching
Enrolment: Enrollable | 1st year (Yes)

The students should be able
• to formulate their corpus requirements for a lexicographic project and specify the design of a representative corpus;
• to compile such a corpus from Web pages or other sources;
• to annotate the corpus with linguistic information using automatic natural language processing tools;
• to search the corpus with regular expressions and more complex queries based on lexico-grammatical patterns;
• to apply quantitative techniques such as collocation or keyword analysis and interpret the results appropriately;
• to communicate the results of their work to fellow students;
• to lead academic discussions about technical and methodological aspects of corpus-based research; and
• to document and archive corpus data and analysis results.

Foundations of corpus linguistics
• Principles and methods of corpus analysis
• Applications of corpus data in lexicography
• Types of corpora, overview of existing corpora
• Corpus design, representativity, data sources, metadata
Corpus compilation
• Building corpora from online data: Web scraping etc.
• Boilerplate removal, normalization, metadata extraction
• Representation and exchange formats
• Online and stand-alone tools for Web corpus compilation
• Automatic linguistic annotation (POS, lemma, NER, parsing, …)
• Online and stand-alone tools for linguistic annotation
Searching corpora
• Regular expressions
• Character encodings and the Unicode standard
• CQP query language for lexico-grammatical patterns
• Practical exercises with SketchEngine and CQPweb
Quantitative analysis
• Frequency lists and metadata distribution
• Collocations and word sketches
• Keyword analysis
• Lexicographic interpretation of results
• Foundations of statistical inference
Reproducibility
• Research methodology and documentation
• Data management, sustainability of corpus resources

HSK 5.4, Ch. XVIII + XIX

Knowledge or contents: Con05, Con06, Con07, Con10
Abilities or skills: H/D01, H/D05, H/D07, H/D03
Competencies: Comp04, Comp03, Comp09

Block seminar (date and duration to be announced)

The teachers choose one of the following (option b recommended):
a) 90-minute final exam on the contents of the seminar
or
b) presentation of class project plus a short paper (ca. 10 pages)
or
c) longer paper (15-20 pages)
Second opportunity:
Assessment at the second opportunity is based on the same criteria.
Students who are officially exempt from attendance are assessed under the same system as all other students.
Academic misconduct (cheating, plagiarism in exercises or tests) will be penalized according to the University regulations on student assessment (“Normativa de avaliación do rendemento académico dos estudantes e de revisión de cualificacións”).

Attendance: max. 35
Requirements for participation: Students must obtain 25 ECTS in the first semester.
Elective module in the second semester.
Language: German and/or English.