ECTS credits: 6
ECTS hours: Student's work: 99; Tutorials: 3; Expository classes: 24; Interactive classes: 24; Total: 150
Languages used: Spanish, Galician
Type: Ordinary Degree Subject RD 1393/2007 - 822/2021
Departments: Statistics, Mathematical Analysis and Optimisation
Areas: Statistics and Operations Research
Center: Faculty of Mathematics
Call: First Semester
Teaching: With teaching
Enrolment: Enrollable
The objectives of this course are for students to:
- Know the models that describe the influence of certain variables (explanatory variables) on another one (response variable).
- Know how to perform model selection and how to apply it to inference and prediction.
- Acquire an introductory knowledge of multivariate methods.
Chapter 1. The general linear model.
Multiple linear regression model and general linear model. Inference on the general linear model. Coefficients estimation and interpretation. The F test. Prediction.
Chapter 2. Diagnosis of outliers and influential observations.
Introduction to outliers and influential observations. Leverages. Normality diagnosis. Detection of influential observations: measures of influence. Rules to deal with outliers and influential observations.
Chapter 3. Construction of a regression model.
Polynomial regression. Interactions. Linearized models. Validation of a multiple regression model. Collinearity. Variable selection methods.
Chapter 4. Analysis of variance.
The analysis of variance model. Parametrization of a discrete explanatory variable. Variability decomposition. The F test. Multiple comparisons. Testing the equality of variances.
Chapter 5. Analysis of covariance.
Model with one discrete and one continuous explanatory variable, with and without interactions. Testing main effects and testing interaction. Model with several discrete and continuous explanatory variables.
Chapter 6. Logistic regression and introduction to generalized linear models.
Logistic regression model: odds and odds-ratio. Maximum likelihood parameter estimation. Estimation algorithms. Inference on the parameters based on their asymptotic distributions and by means of the profile likelihood. Model testing using the deviance. Introduction to the generalized linear models.
Chapter 7. Advanced regression models.
Introduction to penalized linear models. Kernel regression models: Estimation and prediction. Introduction to semiparametric regression models.
Chapter 8. General concepts in multivariate analysis.
Exploratory analysis of multidimensional data. Distribution models in multivariate analysis. Inference on the mean vector and the covariance matrix under normality.
Chapter 9. Principal components.
Decomposition of a random vector in its principal components. Properties. Regression models based on principal components.
Chapter 10. Classification methods.
Supervised classification methods: Bayes rule, Fisher linear discriminant rule, quadratic rule and other classification methods based on regression models. Unsupervised classification methods.
BASIC BIBLIOGRAPHY
Everitt, B. (2005). An R and S-Plus companion to multivariate analysis. Springer. (Available on-line through Iacobus)
Faraway, J.J. (2004). Linear models with R. Chapman and Hall.
Faraway, J.J. (2006). Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman and Hall.
Sheather, S.J. (2009). A modern approach to regression with R. Springer. (Available on-line through Iacobus)
COMPLEMENTARY BIBLIOGRAPHY
Agresti, A. (1990). Categorical data analysis. Wiley.
Agresti, A. (1996). An introduction to categorical data analysis. Wiley.
Draper, N.R. and Smith, H. (1998). Applied Regression Analysis. Wiley.
Greene, W.H. (1999). Análisis econométrico. Prentice Hall.
Johnson, R.A. and Wichern, D.W. (2007). Applied multivariate statistical analysis. Pearson Education.
Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning. Springer.
Koch, I. (2014). Analysis of Multivariate and High-dimensional Data. Cambridge.
Hosmer, D.W. and Lemeshow, S. (1989). Applied logistic regression. Wiley.
Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979). Multivariate analysis. Academic Press.
Peña, D. (2002). Regresión y diseño de experimentos. Alianza Editorial.
Peña, D. (2002). Análisis de datos multivariantes. McGraw-Hill.
Ryan, T.P. (1997). Modern Regression Methods. Wiley.
Seber, G.A.F. (1984). Multivariate observations. Wiley.
Venables, W.N. and Ripley, B.D. (2002). Modern applied statistics with S. Springer.
In this course, according to the proposal for the Degree in Mathematics, the following competences (general, specific and cross-area) will be enhanced.
General competences:
[CX1] Know the most relevant concepts, methods and results from different fields in Mathematics, together with some historical perspective on their development.
[CX2] Collect and interpret data, information and relevant results; derive conclusions and prepare reports for scientific and technological problems that require the use of mathematical tools.
[CX3] Apply theoretical and practical results, as well as abstraction and analysis skills, to the definition and statement of problems and to the search for solutions, in both academic and professional contexts.
[CX4] Communicate, both in written and oral forms, knowledge, procedures, results and ideas in mathematics, both for specialized and non-specialized audiences.
Specific competences:
[CE1] Understand and use the mathematical language.
[CE2] Know rigorous proofs for some classical theorems for different fields in Mathematics.
[CE3] Design proofs for mathematical results, state conjectures and devise strategies for assessing them.
[CE4] Identify errors in incorrect mathematical reasoning, proposing proofs and counterexamples.
[CE5] Assimilate the definition of a new mathematical object, relate it with other known objects and be able to use it in different contexts.
[CE6] Know how to abstract properties and substantial features in a problem, disentangling them from spurious or occasional ones.
[CE7] Propose, analyze, validate and interpret models in simple real situations, using the most suitable mathematical tools for the desired purposes.
[CE8] Plan and execute algorithms and mathematical methods for problem solving in an academic, technical, financial or social context.
[CE9] Use software for statistical analysis, numerical and symbolic calculus, graphical visualization and optimization, in general, for experimenting in Mathematics and problem solving.
Cross-area competences:
[CT1] Use bibliography and bibliographic search tools specific to Mathematics, including internet access.
[CT2] Manage working time optimally and organize the available resources, establishing priorities and alternative routes, and identifying logical errors in decision making.
[CT3] Coherently verify or refute other people's arguments.
[CT5] Read scientific texts, both in the mother tongue and in other languages relevant to science, especially English.
The following sections on methodology and the assessment system detail the activities for enhancing these competences, as well as the evaluation method.
Independently of the scenario, the course will comprise lectures and interactive sessions, as well as sessions in small groups for guiding the projects. Handouts will be provided, as well as other complementary material for software learning, making use of the USC e-learning platform.
The learning-teaching activities will be organized in the following blocks:
Lectures (1 hour per week): during the lectures, the professors will introduce the theoretical and practical concepts, making use of a multimedia presentation. In these sessions, general competence CX1 (concept knowledge) and specific competences CE1, CE2, CE4 (comprehension and use of mathematical language; knowledge of proofs; understanding of definitions of new objects and relations with others) will be strengthened.
Interactive seminars (1 hour per week): interactive sessions are divided into seminars (for problem solving) and computer labs. During the seminars, the following competences will be fostered: CX2 and CX4 (data interpretation and communication); CE1, CE3, CE4, CE6 and CE7 (comprehension and use of mathematical language; designing proofs, identifying errors, abstracting properties, proposing and validating models) and cross-area competence CT3 (checking or refuting arguments). Specifically, for CX2 and CX4, case studies for student training will be designed. In these case studies, CE6 (proposal and model validation) will also be enhanced. Specific competences CE1, CE3, CE4 and CE5, and cross-area competence CT3, will also be promoted through problem-solving activities, carried out individually or in groups and presented during the sessions.
Interactive computer labs (2 hours per week): in these sessions, the students will learn how to use R for regression and multivariate data analysis. The following competences will also be trained: CX3 (application of theoretical and practical knowledge), CE8 and CE9 (planning and execution of algorithms, and use of software). In addition, CX2, CE1, CE6 and CE7 (already considered in lectures and interactive seminars) will also be strengthened.
Small groups (2 hours): these small-group sessions are intended to keep track of the students' work. A series of activities will be proposed to provide the students with a global view of the subject. At the same time, these activities will help the students to detect which topics or techniques need further revision.
Competences CX5, CT1, CT2 and CT5 refer to independent work, use of bibliography, time management and organization, and scientific reading in a foreign language (English). All these competences will be fostered through two case studies: one on regression and another on multivariate analysis.
Lectures, seminars and labs are distributed as follows:
Chapter 1. General linear model (2h lectures, 2h seminars, 4h labs)
Chapter 2. Model validation (2h lectures, 2h seminars, 3h labs)
Chapter 3. Model construction (1h lecture, 1h seminar, 3h labs)
Chapter 4. ANOVA (1h lecture, 1h seminar, 3h labs)
Chapter 5. ANCOVA (1h lecture, 1h seminar, 3h labs)
Chapter 6. Logistic regression and GLM (2h lectures, 1h seminar, 2h labs)
Chapter 7. Advanced regression models (2h lectures, 2h seminars, 4h labs)
Chapter 8. General concepts in multivariate analysis (1h lecture, 2h seminars, 0h labs)
Chapter 9. Principal component analysis (1h lecture, 1h seminar, 3h labs)
Chapter 10. Classification methods (1h lecture, 1h seminar, 3h labs)
Continuous assessment will represent 40% of the final grade, and the remaining 60% will correspond to the final exam.
Continuous assessment (40%): Continuous assessment will include forms to assess the interactive computer labs, in which some practical examples will be solved with the software R and some questions about the interpretation of results will be posed. Additionally, the students will have to solve a real-data case (individually or in groups). The evaluation report for this activity will be given to the students before the exam, so that they have some feedback. The continuous assessment grade will be kept throughout the academic year. During the first week, the professors will announce the dates and deadlines for the different activities, in order to help the students organize their work. The tasks proposed during the course will be used to assess the level of general competences CX2, CX3, CX4 and CX5, specific competences CE4 and CE8, and all cross-area competences. In addition, CE5, CE6, CE7 and CE9 will also be partially assessed by this method.
Final exam (60%): the final exam will include theoretical and practical questions on the subject contents. Outputs of statistical analyses from the package used in the computer lab may be included for interpretation. The final exam will include brief questions and practical exercises to be solved in the computer lab. The final exam will assess CX1, CE1, CE2 and CE3, as well as specific competences CE5, CE6 and CE9, which are partially evaluated.
Continuous assessment tasks and final exam will be the same for all the groups and labs in this course.
The final grade in the ordinary evaluation period will be the maximum of the weighted grade (continuous assessment and exam) and the exam grade.
The weight of the continuous assessment in the extraordinary exams will be the same as in the ordinary assessment period. In the second opportunity (July), there will be a final exam and the final grade will be the maximum of three quantities: the ordinary opportunity grade, the final exam grade, and the weighted average of continuous assessment and final exam.
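The grading rules above can be sketched as a short calculation. This is only an illustrative sketch, not an official calculator: the function names are hypothetical, and it assumes that all grades are on the same numeric scale (for example 0 to 10) and that the 40%/60% weights apply in both opportunities, as stated above.

```python
# Illustrative sketch of the grading rules; function names are hypothetical.
# Assumption: all grades share one numeric scale (for example 0 to 10).

CONTINUOUS_WEIGHT = 0.4  # continuous assessment: 40% of the final grade
EXAM_WEIGHT = 0.6        # final exam: 60% of the final grade


def weighted_grade(continuous: float, exam: float) -> float:
    """Weighted average of continuous assessment and final exam."""
    return CONTINUOUS_WEIGHT * continuous + EXAM_WEIGHT * exam


def ordinary_grade(continuous: float, exam: float) -> float:
    """Ordinary period: maximum of the weighted grade and the exam grade."""
    return max(weighted_grade(continuous, exam), exam)


def second_opportunity_grade(ordinary: float, continuous: float,
                             july_exam: float) -> float:
    """Second opportunity: maximum of the ordinary opportunity grade, the
    July exam grade, and the weighted average with the July exam."""
    return max(ordinary, july_exam, weighted_grade(continuous, july_exam))


if __name__ == "__main__":
    first = ordinary_grade(continuous=8.0, exam=5.0)
    second = second_opportunity_grade(first, continuous=8.0, july_exam=6.0)
    print(f"ordinary: {first:.2f}, second opportunity: {second:.2f}")
```

Under these rules, a weak continuous assessment grade never lowers the final grade below the exam grade, and the second opportunity never lowers the grade obtained in the ordinary period.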
Assessment attendance: students are considered to have attended the assessment if they have participated in any of the assessment activities (continuous assessment and/or final test).
It should be noted that in case of fraudulent exercises or tests, the “Normativa de avaliación do rendemento académico dos estudantes e de revisión de cualificacións” will be applied.
Individual work is about one hour and a half for each hour of teaching, including the preparation of the assignments.
Basic knowledge of probability and statistics is required. Some experience using statistical software is also recommended. For a better understanding of the subject, it is advisable to keep in mind the practical meaning of the methods introduced in this course.
Handouts prepared by the lecturers will be made available through the Virtual Campus (VC). Messages for the course will be distributed through the VC forum. In addition, an MS Teams group will be created to facilitate communication between lecturers and students.
Lectures and interactive sessions will be on-campus and will be complemented with material in the virtual classroom. There the students will find bibliographical notes, exercises, material for practice, etc. Through the virtual campus, the students will also be able to complete and submit the tasks required for continuous assessment. Meetings during office hours will be held on-campus or via e-mail.
Manuel Febrero Bande
- Department
- Statistics, Mathematical Analysis and Optimisation
- Area
- Statistics and Operations Research
- Phone
- 881813187
- manuel.febrero [at] usc.es
- Category
- Professor: University Professor
Rosa María Crujeiras Casais
Coordinator
- Department
- Statistics, Mathematical Analysis and Optimisation
- Area
- Statistics and Operations Research
- Phone
- 881813212
- rosa.crujeiras [at] usc.es
- Category
- Professor: University Professor
Maria Vidal Garcia
- Department
- Statistics, Mathematical Analysis and Optimisation
- Area
- Statistics and Operations Research
- mariavidal.garcia [at] usc.es
- Category
- Ministry Pre-doctoral Contract
Time | Group | Languages | Room |
---|---|---|---|
Monday | | | |
10:00-11:00 | Group /CLIL_02 | Galician, Spanish | Computer room 3 |
11:00-12:00 | Group /CLIL_01 | Spanish, Galician | Computer room 3 |
12:00-13:00 | Group /CLIL_03 | Galician, Spanish | Computer room 3 |
Tuesday | | | |
10:00-11:00 | Group /CLE_01 | Spanish, Galician | Classroom 06 |
17:00-18:00 | Group /CLIL_01 | Galician, Spanish | Computer room 3 |
18:00-19:00 | Group /CLIL_03 | Spanish, Galician | Computer room 3 |
19:00-20:00 | Group /CLIL_02 | Galician, Spanish | Computer room 3 |
Thursday | | | |
13:00-14:00 | Group /CLIS_01 | Galician, Spanish | Classroom 02 |
01.21.2025 10:00-14:00 | Group /CLE_01 | | Computer room 2 |
06.27.2025 10:00-14:00 | Group /CLE_01 | | Computer room 2 |