Introduction to bioinformatics

2022/2023
Programme:
Computer Science and Mathematics, Second Cycle
Year:
1 and 2
Semester:
first
Kind:
optional
ECTS:
6
Language:
Slovenian, English
Lecturers:

Blaž Zupan

Hours per week, 1st semester:
Lectures
3
Seminar
1.33
Tutorial
0.67
Lab
0
Content (Syllabus outline)

How similar are living organisms? Have humans indeed descended from Neanderthals? How did various species adapt to their living environments? Which genes are responsible for susceptibility to various diseases? Why do we need a different flu vaccine each year?
Modern biology poses many interesting questions, and never before have we been so close to answering them. Recently developed experimental biotechnologies allow us to gather vast amounts of experimental data, from genomes of various species, including that of H. sapiens, to gene expression, protein concentrations, and the effects of various chemicals on cell processes. A vast number of experimental data sets are available today in open, public repositories, and they require further statistical and mathematical analysis to discover useful and applicable patterns. The methods and techniques for such analysis are developed within the field of bioinformatics, which combines techniques from statistics, computer science, mathematics, data mining and visualization, machine learning, and artificial intelligence. During the course, students will become familiar, in theory and practice, with the following topics:
Basics of molecular biology
Statistical properties of nucleotide sequences
Computational approaches to gene finding and annotation
Sequence alignment (BLAST)
Probabilistic models for nucleotide sequences, Markov chain models
Computational techniques for assessment of genetic distances between species and individuals within the same species
Phylogenetic analysis, computational techniques for construction of evolutionary trees
Computational comparison of genomes
Analysis of transcriptome, utility of data mining and visualization techniques, gene set enrichment analysis, gene networks, applications in biomedicine
Integrative bioinformatics: how to combine various data sources and various modelling techniques to discover patterns in biomedical data sets
Theoretical study of the above concepts will be accompanied by hands-on work with public data repositories and open-source tools to access the data and perform subsequent analysis. We will use scripting tools (e.g., Python) and existing bioinformatics libraries (e.g., Biopython and Orange).
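To give a flavour of the scripting involved, the following is a minimal sketch in plain Python of three ideas from the topic list: GC content, k-mer counts, and a first-order Markov chain estimated from a nucleotide sequence. It is only an illustration, not course material; the function names and the example sequence are made up, and no bioinformatics library is assumed.

```python
from collections import Counter, defaultdict


def gc_content(seq):
    """Return the fraction of G and C nucleotides in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_transitions(seq):
    """Estimate first-order transition probabilities P(next | current).

    Probabilities are maximum-likelihood estimates, i.e. observed
    counts of adjacent nucleotide pairs normalized per current nucleotide.
    """
    seq = seq.upper()
    counts = defaultdict(Counter)
    for current, following in zip(seq, seq[1:]):
        counts[current][following] += 1
    return {
        current: {nt: n / sum(nexts.values()) for nt, n in nexts.items()}
        for current, nexts in counts.items()
    }


if __name__ == "__main__":
    example = "ATGCGCGATATCGCGTA"  # made-up sequence, for illustration only
    print("GC content:", gc_content(example))
    print("Dinucleotide counts:", dict(kmer_counts(example, 2)))
    print("Transitions from A:", markov_transitions(example).get("A"))
```

In practice, sequences would be read from a public repository or a FASTA file rather than typed in, and pseudocounts are commonly added before normalizing so that unseen transitions do not get zero probability.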

Readings

Cristianini N, Hahn MW (2007) Introduction to Computational Genomics: A Case Studies Approach. Cambridge University Press, Cambridge.
Durbin et al. (1998) Biological Sequence Analysis. Cambridge University Press, Cambridge.
James D. Watson, Andrew Berry (2004) DNA: The Secret of Life, Arrow Books, UK. (also in Slovene: DNK, skrivnost življenja, Modrijan, Ljubljana, 2007).

Objectives and competences

This is an introductory course in bioinformatics. During the course, students will become familiar with computational methods and tools used in bioinformatics, and with publicly available databases in molecular biology. The course will start with an introduction to molecular biology and genomics, which will allow students of computer science to apply mathematical, statistical, and computational techniques to problems such as the evolution of living organisms, interactions of genes and biological processes, and relations between the genome, phenotypes, and diseases.

Intended learning outcomes

After successful completion of the course, the students should be able to:
understand essential concepts from molecular biology and evolution,
know how and where to access molecular biology data,
understand computational techniques for sequence analysis,
understand techniques for phylogeny analysis, analysis of gene expression data, and comparison of genomes,
know how to access and analyze molecular biology data by scripting in Python and using Python libraries for bioinformatics,
recognize advantages that computational methods and algorithms may provide in the area of life sciences.
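As an illustration of the kind of computation behind the phylogeny-related outcomes above, the following minimal Python sketch estimates the genetic distance between two aligned sequences. It computes the proportion of differing sites p and then applies the Jukes-Cantor correction d = -3/4 ln(1 - 4p/3). The function names and test sequences are hypothetical, and the sketch assumes gap-free, equal-length alignments.

```python
import math


def p_distance(a, b):
    """Proportion of aligned positions at which sequences a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    differences = sum(x != y for x, y in zip(a.upper(), b.upper()))
    return differences / len(a)


def jukes_cantor(a, b):
    """Jukes-Cantor distance: expected substitutions per site.

    Corrects the observed proportion of differences p for multiple
    substitutions at the same site. The correction is undefined for
    p >= 0.75, in which case infinity is returned.
    """
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    # Made-up aligned sequences, for illustration only.
    print(jukes_cantor("ACGTACGTAC", "ACGTTCGTAA"))
```

A matrix of such pairwise distances is the usual input to distance-based tree construction methods.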

Learning and teaching methods

Lectures combining the blackboard with computer projection (coding, visualization of models and results). Lab work in computer-equipped lecture rooms. Individual work and teamwork. Emphasis on practical problem solving.

Assessment

Continuing (homework, midterm exams, project work)
Final (written and oral exam)
Grading: 5 (fail), 6-10 (pass) (according to the Statute of UL)

Lecturer's references

Five most important works:
Stajdohar M, Rosengarten RD, Kokosar J, Jeran L, Blenkus D, Shaulsky G, Zupan B (2017) dictyExpress: a web-based platform for sequence data management and analytics in Dictyostelium and beyond. BMC Bioinformatics 18(1):291.
Zitnik M, Zupan B (2016) Jumping across biomedical contexts using compressive data fusion. Bioinformatics 32(12):i90-i100.
Zitnik M, Nam EA, Dinh C, Kuspa A, Shaulsky G, Zupan B (2015) Gene prioritization by compressive data fusion and chaining, PLoS Computational Biology 11(10):e1004552.
Staric A, Demsar J, Zupan B (2015) Concurrent software architectures for exploratory data analysis. WIREs Data Mining and Knowledge Discovery 5(4):165-180.
Zitnik M, Zupan B (2015) Data fusion by matrix factorization. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(1):41-53.
The complete bibliography is available on SICRIS:
http://sicris.izum.si/search/rsr.aspx?lang=slv&,id=7764.