Topics in data analysis

2024/2025
Programme: Financial mathematics, First Cycle
Year: 3rd year
Semester: second
Kind: optional
Group: B
ECTS: 5
Language: Slovenian
Lecturer (contact person):
Hours per week – 2nd semester:
Lectures: 2
Seminar: 0
Tutorial: 2
Lab: 0
Prerequisites

Completion of the course Introduction to programming.

Content (Syllabus outline)

Introduction to machine learning: data, predictive models, loss functions, evaluating the accuracy and generalization of predictive models, problems with overfitting, and the curse of dimensionality.
Algorithms for supervised machine learning of predictive models from data: decision trees, support vector machines, neural networks, and model ensembles.
Algorithms for unsupervised learning: clustering, principal component analysis.
Using machine learning algorithms for data analysis: dimensionality reduction, handling missing data, embeddings for dealing with text.
We will learn how machine learning algorithms work and the mathematics behind them, so that we can use them effectively in solving practical data analysis tasks.

Readings
  1. P. Flach: Machine learning : the art and science of algorithms that make sense of data, Cambridge : Cambridge University Press, 2017.
  2. G. James … [et al.]: An introduction to statistical learning with applications in Python, Cham : Springer, 2023. Freely available at https://www.statlearning.com/
  3. M. Kuhn, K. Johnson: Applied predictive modeling, New York : Springer, cop. 2019.
  4. F. Pedregosa … Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research 12(85) (2011), pp. 2825-2830.
Objectives and competences

During the lectures, the student learns how a wide range of machine learning algorithms work, from supervised learning of decision trees, through support vector machines and neural networks for regression and classification, to unsupervised clustering. During the exercises, through practical work on selected datasets, the student learns the basics of data analysis with machine learning.

Intended learning outcomes

Knowledge and understanding: Students become familiar with different machine learning algorithms, understand their operation and the influence of their settings on the obtained predictive models and patterns, and know how to formulate a data analysis task that makes use of machine learning.

Application: Practical application of machine learning algorithms to various data analysis tasks on datasets. At the same time, students improve their knowledge of basic data-analysis recipes.

Reflection: Determining the strengths and weaknesses of individual machine learning algorithms, implementing simple machine learning algorithms, upgrading their implementations, and choosing the appropriate algorithm for a given data analysis task.

Transferable skills: Working with a computer, as well as data-analytical and algorithmic ways of thinking.

Learning and teaching methods

Lectures, exercises, homework, consultations

Assessment

Homework and a project
Oral exam
Grading: 5 (fail), 6-10 (pass) (according to the Statute of UL)

Lecturer's references

Alexander Keith Simpson:
– EGGER, Jeff, MØGELBERG, Rasmus Ejlers, SIMPSON, Alex. The enriched effect calculus: syntax and semantics. Journal of logic and computation, ISSN 0955-792X, 2014, vol. 24, iss. 3, pp. 615-654 [COBISS-SI-ID 17090137]
– EGGER, Jeff, MØGELBERG, Rasmus Ejlers, SIMPSON, Alex. Linear-use CPS translations in the enriched effect calculus. Logical methods in computer science, ISSN 1860-5974, 2012, vol. 8, iss. 4, paper 2 (pp. 1-27) [COBISS-SI-ID 17090905]

Ljupčo Todorovski:
– MEŽNAR, Sebastian, DŽEROSKI, Sašo, TODOROVSKI, Ljupčo. Efficient generator of mathematical expressions for symbolic regression. Machine learning. Nov. 2023, vol. 112, iss. 11, pp. 4563-4596 [COBISS-SI-ID 176785923]
– BRENCE, Jure, DŽEROSKI, Sašo, TODOROVSKI, Ljupčo. Dimensionally-consistent equation discovery through probabilistic attribute grammars. Information Sciences. Jun. 2023, vol. 632, pp. 742-756 [COBISS-SI-ID 151276803]
– BRENCE, Jure, TODOROVSKI, Ljupčo, DŽEROSKI, Sašo. Probabilistic grammars for equation discovery. Knowledge-based systems. [Print ed.]. 2021, vol. 224, pp. 107077-1-107077-12. [COBISS-SI-ID 61709059]