Completed course Introduction to programming.
Data analysis with program R
Introduction. R as a calculator.
Spreadsheets, units, variables. Measurement scales. Data preparation and editing. Excel, CSV. Reading and storing.
Scalars and elementary data types in R. Type coercion, missing values (NA). Vectors, vector attributes, matrices, and indexing. Lists and data tables.
Ordinal and nominal data. Representation in R. Organization of data in data tables. Importing and cleaning data from spreadsheets and Web pages. Regular expressions.
Analysis of tabular data using the basics of the SQL query language, grouping, and summarization.
Visualization of tabular data: the grammar of statistical graphics in ggplot2, aesthetic properties, aesthetic mappings, scales, and legends. Data visualizations on geographical maps.
Clustering. Measures of distance between objects and clusters. Agglomerative clustering method and k-means algorithm. Evaluating the quality of clustering and silhouettes.
- E. Jones, S. Harden, M. J. Crawley: The R book, 3rd ed., Chichester : Wiley, 2022.
- P. Murrell: R graphics, 3rd ed., Routledge : CRC Press, 2021.
- H. Wickham: Advanced R, 2nd ed., Boca Raton : CRC Press, 2019. Freely available at https://adv-r.hadley.nz/
- H. Wickham, M. Çetinkaya-Rundel, G. Grolemund: R for Data Science, 2nd ed., Beijing : O’Reilly, 2023. Freely available at https://r4ds.hadley.nz/
- Websites https://www.r-project.org and https://posit.co/download/rstudio-desktop/
Students learn the programming language R and its environment. Using the language, they learn the basics of statistical data analysis and visualization.
Knowledge and understanding: Students learn the R software package, designed primarily for statistical data analysis and visualization. They build on their knowledge of basic programming techniques and learn some special features of the R language.
Application: Building user libraries, preparing charts, simple data analysis.
Reflection: The importance of modern information technology in the analysis of large amounts of data, and the importance of visualization in data exploration and the presentation of results.
Transferable skills: Working with a computer, algorithmic way of thinking.
Lectures, exercises, homework, consultations
Homework, final project
Theoretical exam
Grading: 5 (fail), 6-10 (pass), according to the Statute of UL
Andrej Bauer:
– BAUER, Andrej, BIRKEDAL, Lars. Continuous functionals of dependent types and equilogical spaces. In: CLOTE, Peter G. (ed.). Computer science logic : 14th international workshop, CSL 2000, annual conference of the EACSL, Fischbachau, Germany, August 21-26, 2000 : proceedings, (Lecture notes in computer science, ISSN 0302-9743, 1862). Berlin [etc.]: Springer, 2000, vol. 1862, pp. 202-216 [COBISS-SI-ID 10606681]
– BAUER, Andrej, TAYLOR, Paul. The Dedekind reals in abstract Stone duality. Mathematical structures in computer science, ISSN 0960-1295, 2009, vol. 19, iss. 4, pp. 757-838 [COBISS-SI-ID 15322201]
– BAUER, Andrej, STONE, Christopher A. RZ: a tool for bringing constructive and computable mathematics closer to programming practice. Journal of logic and computation, ISSN 0955-792X, 2009, vol. 19, no. 1, pp. 17-43 [COBISS-SI-ID 15325785]
Ljupčo Todorovski:
– MEŽNAR, Sebastian, DŽEROSKI, Sašo, TODOROVSKI, Ljupčo. Efficient generator of mathematical expressions for symbolic regression. Machine learning. Nov. 2023, vol. 112, iss. 11, pp. 4563-4596 [COBISS-SI-ID 176785923]
– BRENCE, Jure, DŽEROSKI, Sašo, TODOROVSKI, Ljupčo. Dimensionally-consistent equation discovery through probabilistic attribute grammars. Information Sciences. Jun. 2023, vol. 632, pp. 742-756 [COBISS-SI-ID 151276803]
– BRENCE, Jure, TODOROVSKI, Ljupčo, DŽEROSKI, Sašo. Probabilistic grammars for equation discovery. Knowledge-based systems. [Print ed.]. 2021, vol. 224, pp. 107077-1-107077-12. [COBISS-SI-ID 61709059]