ECTS credits: 6
ECTS hours (Rules/Reports): Hours of tutorials: 1; Expository class: 21; Interactive classroom: 21; Total: 43
Languages of instruction: English
Type: Ordinary subject Master’s Degree RD 1393/2007 - 822/2021
Departments: Electronics and Computing, External department linked to the degrees
Areas: Computer Architecture and Technology, External area of the Master's Degree in Artificial Intelligence
Center: Higher Technical Engineering School
Call: First Semester
Teaching: With teaching
Enrolment: Enrollable
The ever-growing volume of information accessible through the Internet has made the efficient processing of large amounts of data increasingly important. This has led to the development of new techniques for storing and processing huge amounts of information, techniques that are naturally suited to distributed systems.
The main objective of this subject is to provide students with the knowledge and skills necessary to understand, develop and apply artificial intelligence (AI) techniques in Big Data environments.
1. Introduction to Big Data
a) What is Big Data
b) Big Data Applications
c) Big Data Analytics
d) Problems of data analysis in Big Data environments
2. Data preparation and visualisation
a) Data pre-processing techniques
b) Visualisation techniques
3. Federated learning
a) Edge learning
b) Privacy preservation
4. Infrastructures for Big Data storage and processing
a) Parallelism and distributed-memory systems
b) High Performance Computing versus Big Data Computing
c) Apache Hadoop and MapReduce
5. Large-scale data processing: Apache Spark
a) Batch and streaming processing
b) Architecture
c) Spark Core (RDDs) and Spark SQL, DataSets and DataFrames
d) Spark DataFrames
6. Machine Learning with Apache Spark
a) Machine Learning workflow
b) Supervised and unsupervised machine learning
c) Tuning, evaluation and pipelines
Basic bibliography
- Class notes provided by the professors
- A. Polak, Scaling Machine Learning with Spark, O'Reilly, 2023
- I. Triguero, M. Galar, Large-Scale Data Analytics with Python and Spark, Cambridge University Press, 2023
Complementary bibliography
- T. White, Hadoop: The Definitive Guide, 4th Edition, O'Reilly, 2015
- J. Damji, B. Wenig, T. Das and D. Lee. Learning Spark, 2nd Edition, O'Reilly, 2020
As a result of the learning process, students who take this course will be able to:
- Know the techniques for designing scalable AI solutions at the level of software and hardware resources.
- Integrate data of large volume and variety in AI Big Data projects.
- Know the scalability paradigms of machine learning algorithms.
- Understand, analyse and design the infrastructures needed for Big Data AI projects: local/cloud environments and physical/virtual equipment with low-latency storage systems and distributed file systems.
- Know the languages, frameworks and components that increase performance on hardware infrastructures with CPUs and GPUs.
- Know the techniques that allow low-latency visualisation of data in environments with large volumes of information.
- Select and apply the correct KPIs in each environment.
Degree competences addressed (see the degree report):
- Basic and general: CB6, CB7, CB8, CG2, CG3, CG4, CG5.
- Transversal: CT3, CT7, CT8, CT9.
- Specific: CE10, CE11, CE12, CE13, CE14, CE15.
The classes will be face-to-face and will be transmitted synchronously to the other campuses.
- Theory classes, in which the content of each topic is presented. Students will have copies of the slides beforehand, and the professor will encourage active participation, asking questions to clarify specific aspects and leaving open questions for the students' reflection.
- Practical classes in the computer classroom, which allow students to become familiar, from a practical point of view, with the topics presented in the theory classes.
- Learning based on problems, seminars, case studies or projects, which allows students to acquire certain competences by working through exercises, case studies and projects.
Classroom-based training activities and their relationship with the competencies of the degree course:
- Theory classes: taught by the professor and seminar exposition. Competences covered: CG2, CB6, CE10, CT3, CE11, CE12, CE13, CE14, CE15.
- Practical laboratory classes and problem-based learning. Competences covered: CG2, CG5, CB7, CT7, CT8, CT9.
- Scheduled tutorials: guidance for carrying out individual or group work, answering questions and continuous assessment activities. Competences covered: T1.
Non-classroom training activities and their relationship with the competences of the degree:
- Personal work of the student: consulting the bibliography, independent study, carrying out scheduled activities, preparing presentations and assignments. Competences covered: CG2, CG3, CG4, CG5, CT3, CT7, CT8, CT9.
All essential and complementary training and information material will be accessible in Microsoft Teams (UDC).
- Evaluation of practical work: 50% of the grade. The students' solutions to the assigned practicals will be evaluated. The evaluation may consist of correction by the teacher, a defence of the solution by the student before the teacher, or an oral presentation of the developed solution. All work must be delivered before the dates that will be specified and must meet minimum quality requirements to be taken into consideration. The degree of compliance with the specifications, the methodology and rigour, and the presentation of results will be evaluated. In this part the competences CG2, CG3, CG4, CG5, CB6, CB7, CB8, CT3, CT7, CT8, CT9, CE10, CE11, CE12, CE13, CE14, CE15 will be evaluated implicitly or explicitly.
- Evaluation of theoretical work: 5% of the grade. Collaborative learning projects will be evaluated, in which students work (preferably in pairs or groups) to analyse in detail a scientific article related to the topics covered in theory and present it to the entire class, where questions can be asked. These projects can be completed during non-face-to-face teaching hours, and their objective is to deepen the content of the subject, as well as to acquire competences in critical analysis, summarization, and oral presentation. The degree of compliance with the specifications, the methodology and rigour, and the presentation of results will be assessed. In this part, the competences CG2, CG3, CG4, CG5, CB1, CB3, CT3, CT4, CT7, CT8, CT9, CE11, and CE15 will be evaluated implicitly or explicitly.
- Final exam: 45% of the grade. In this part the competences CG2, CB6, CB7, CB8, CT9, CE10, CE11, CE12, CE13, CE14, CE15 will be evaluated implicitly or explicitly.
In order to pass the course, a total score of 5 or higher must be achieved and it is essential to have completed on time all the practicals indicated as compulsory. Late submissions will not be assessed.
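The weighting and the pass rule above can be sketched as a short calculation. This is only an illustrative sketch, not an official grading tool: it assumes each component is marked on a 0-10 scale (the guide states only the pass mark of 5), and the names `final_grade`, `passes` and `compulsory_on_time` are hypothetical.

```python
# Weights in percent, taken from the assessment list above.
# Integer percentages keep the arithmetic exact before the final division.
WEIGHTS_PERCENT = {"practical": 50, "theory_work": 5, "exam": 45}
PASS_MARK = 5.0  # "a total score of 5 or higher"


def final_grade(practical, theory_work, exam):
    """Weighted total, assuming every component is marked from 0 to 10."""
    weighted_sum = (
        WEIGHTS_PERCENT["practical"] * practical
        + WEIGHTS_PERCENT["theory_work"] * theory_work
        + WEIGHTS_PERCENT["exam"] * exam
    )
    return weighted_sum / 100


def passes(practical, theory_work, exam, compulsory_on_time):
    """Pass requires BOTH conditions: all compulsory practicals
    submitted on time and a weighted total of at least PASS_MARK."""
    return compulsory_on_time and final_grade(practical, theory_work, exam) >= PASS_MARK
```

For example, marks of 7 (practical work), 8 (theoretical work) and 4 (final exam) give 0.50·7 + 0.05·8 + 0.45·4 = 5.7, which passes only if the compulsory practicals were also submitted on time.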
Condition for a grade of Not Presented: not submitting any practical and not attending the final exam.
Students who are not newly enrolled do not retain grades from previous courses.
Resit opportunity (July) and extraordinary opportunity:
The assessment will be the same as in the ordinary opportunity. Students who did not submit the proposed work throughout the term must submit it before the established date.
Condition for a grade of Not Presented: not submitting any practical and not attending the final exam.
In the case of fraudulent performance of exercises or tests, the provisions of the Normativa de avaliación do rendemento académico dos estudantes e de revisión de cualificacións will apply.
In application of the Normativa da ETSE sobre plaxio (approved by the ETSE Council on 12/19/2019), the total or partial copying of any practical or theory exercise will mean failure in both opportunities of the course, with a grade of 0.0 in both cases.
- Theory classes: 21 hours attendance, 42 hours total dedication.
- Practical laboratory classes: 14 classroom hours, 60 hours total dedication.
- Learning based on problems, seminars, case studies and projects: 7 attendance hours, 48 hours total dedication.
- Total: 150 h
Because the theoretical and practical parts are strongly interrelated, and the theoretical part introduces closely related concepts progressively, students are advised to set aside time every day for study or review.
Intensive use will be made of online communication tools: videoconference, chat, etc.
Classes will be taught in English. Both the class teaching material and the bibliography are entirely in English.
Alvaro Ordoñez Iglesias (Coordinator)
- Department: Electronics and Computing
- Area: Computer Architecture and Technology
- Phone: 881815508
- Email: alvaro.ordonez [at] usc.es
- Category: Professor: LOU (Organic Law for Universities) PhD Assistant Professor
Monday
Time | Group | Language | Classroom |
---|---|---|---|
17:00-18:30 | Grupo /CLE_01 | English | IA.12 |
18:30-20:00 | Grupo /CLIL_01 | English | IA.12 |

Date and time | Group | Classroom |
---|---|---|
01.10.2025 10:30-14:00 | Grupo /CLIL_01 | IA.12 |
01.10.2025 10:30-14:00 | Grupo /CLE_01 | IA.12 |
06.17.2025 10:30-14:00 | Grupo /CLIL_01 | IA.12 |
06.17.2025 10:30-14:00 | Grupo /CLE_01 | IA.12 |