BioSS Courses

Our current portfolio of BioSS courses in statistics, mathematical modelling and molecular sequence analysis is shown below. Most of these courses are held in Edinburgh, Aberdeen and Dundee, but other locations can be offered by arrangement. We are also now offering some of our courses online. The number of course participants is deliberately limited to give more time for interaction between participants and presenters.

For information on course timetables and charges please select from the menu on the right.

If you wish to register for a course, please book online. If your course is not currently scheduled, please register your interest; this lets us gauge demand and increases the chances of it running in future. For other enquiries, please email BioSS Training.

How to download R.

Online Courses (1)

Getting Started in R - Online Course

R is a free software environment for statistical computing and graphics, available for Windows, Linux, Unix and Macintosh systems. This course introduces participants to the use of R. It covers the basics of writing simple scripts, reading and manipulating data, creating graphics, carrying out statistical analyses and very basic programming. Practicals form an important part of the course. Our online course consists of self-learning supported by videos, practicals and other supplementary material. It is structured in four modules, with an interactive session with a BioSS tutor after each module.

No prior knowledge of R is required, but you will need to have R and RStudio installed on your computer before starting the course. Note that this is an introductory course about learning to work in R; it is not a course about learning basic statistics. The final module of the course (Module 4) illustrates how to carry out t-tests, regression and ANOVA, but it does not discuss their use and interpretation in detail. If you do not have existing knowledge of basic statistics, we recommend that you take Modules 1-3 of this course and then take the BioSS "Basic Statistics in R" course.

Courses Held In-person (11)

Getting Started in R

R is a free software environment for statistical computing and graphics, available for Windows, Linux, Unix and Macintosh systems.

This course introduces participants to the use of R on Windows systems. The basics of how to install R, write simple scripts, read and manipulate data, create graphics, do statistical analyses and very basic programming are covered. Practicals form an important part of the course. No previous experience with R is assumed.

Basic Statistics in R *

This two-day course introduces the important ideas in statistics and data analysis. It assumes that participants have no previous knowledge of statistics and data analysis, or have not used them for some time, or need a reminder of the basic ideas and interpretation.

Emphasis is placed on exploratory methods for examining data prior to analysis, using graphs, tables and summary statistics. The course then progresses to cover the elementary aspects of estimation and testing, including

  • One- and two-sample t-tests
  • Confidence intervals
  • One- and two-way analysis of variance
  • Simple linear regression

The course includes several practical sessions that provide an opportunity to discuss the theory and try out the methods. The practical exercises use R, which is free and open source. In its basic form R offers many standard statistical methods, and vast libraries of code can be downloaded to extend its functionality. Whilst the basics of how to use R are covered, we advise that attendees of Basic Statistics in R first attend the BioSS Getting Started in R course, as they will need to write simple code. If participants cannot attend Getting Started in R, we can provide access to the online materials for self-study.

These courses also form a good foundation for the more advanced courses, which cover specific topics in statistics.

* BioSS also offers this course using Genstat

Experimental Design and Analysis

Experimentation is fundamental to much scientific and engineering research. Careful consideration of design is needed to make best use of the experimental resources available. Appropriate analysis reveals the conclusions that can be drawn from the resulting data.

This two-day course covers the important topics in design and analysis, including randomisation, replication, blocking, factorial treatment structures, use of covariates and choice of design. Analysis of variance is used to interpret experimental results.

This course is suitable for scientists who have a good working knowledge of basic statistical concepts but have little or no experience of collecting and analysing data in more complex situations. To benefit from the course, delegates should be familiar with the ideas of estimation, including standard errors and confidence intervals, and understand the rationale and output of standard statistical tests such as the t-test.

A Basic Statistics course is available if you feel you need to increase or refresh your knowledge of statistics before embarking on this course. More specialised courses in this area, including Statistical Methods for Repeated Measures Data and Introduction to Mixed Models and REML, are also available. If you are in any doubt about which course would be most appropriate for your needs, please contact Graham Horgan.

Regression and Curve Fitting

Regression is used to investigate and quantify the relationships between variable quantities. This two-day course begins with a careful examination of the simplest case, linear regression, and then progresses to include

  • Regression with several explanatory variables
  • Non-linear regression (exponential and growth curves)
  • General non-linear modelling
  • Generalised linear models

The notes also cover the ideas of generalised additive models, which interpret trends in data by direct smoothing rather than by fitting parametric curves.

Introduction to Mixed Models

Mixed models are used when data have a complex structure with random variation occurring at different levels. REML (restricted maximum likelihood) provides a method to estimate how much variability is due to each level, and the extent to which factors and covariates of interest affect the outcome variables. This course introduces the use of mixed models. The topics covered include:

  • When to use mixed models
  • Fixed and random effects
  • Choosing a model of variability
  • Estimating and testing fixed and random effects
  • Mixed models and regression / covariates
  • Modelling dependency among observations
  • Generalised linear mixed models

Practical sessions in which data (including any provided by the participants) are analysed using Genstat are an important part of the course. Participants should be familiar with basic statistics, including ANOVA and simple linear regression.

Statistical Methods for Repeated Measures Data

Repeated measures data arise whenever several measurements of the same variable are made on each subject of study, usually at different times. A range of methods for studying such data is described, and the situations where each may be used are discussed. The topics covered in this 1.5-day course are:

  • Plotting and displaying repeated measures data
  • Analysis of summaries
  • Split plot analysis
  • Multivariate analysis of variance
  • Antedependence modelling
  • Design of repeated measures experiments

Graphical Methods for Multivariate Data

Multivariate data arise whenever several variables are recorded on each subject of study. Many methods are now available for studying the structure and patterns in such data. This two-day course will cover the most useful of these, including

  • Principal components analysis and biplots
  • Canonical variates and discriminant analysis
  • Principal coordinates analysis
  • Multidimensional scaling
  • Classification techniques
  • Clustering

Association Mapping using R

This course will introduce the basic concepts of Association Mapping. The course will equip participants with the necessary information and software to conduct an Association Mapping Analysis on their own data, highlighting areas that need to be considered such as accounting for population structure and relationships between individuals. The course is interactive with practical examples using the software R. It assumes participants are:

  • Familiar with the concept of a simple linear model
  • Proficient in R

Methods for Time Series Data

The aim of the course is to provide students with an advanced knowledge of time series analysis, from both a theoretical and a practical perspective.

Course content: (1) Introduction to temporally correlated data: examples, terminology and objectives of time series analysis; (2) Simple descriptive statistics: stationary time series, time plots, transformations; (3) Probability models for time series: moving average models, autoregressive models and mixed autoregressive moving average (ARMA) models; (4) Estimation in the time domain; (5) Forecasting.

By the end of the course, students will be able to: (1) demonstrate advanced knowledge of the main statistical methods in linear time series analysis; (2) visualise temporal data sets, formalise data problems within a statistical framework, develop an implementation plan, apply inferential tools and identify trends, structures and patterns in time series data, and produce accurate temporal predictions; (3) evaluate the goodness of fit of a model, detect violations of the model assumptions and interpret the results of an empirical analysis; (4) use appropriate statistical language, describe the model assumptions and communicate the results of empirical statements; (5) independently develop and implement time series models and analyse real data sets.

Practical demonstrations of some examples will be shown via the software R (attendees are invited to install the R packages “TSA”, “tseries”, “astsa”, “lmtest”).

Bayesian Methods for Data Analysis

This short course will summarise the basic concepts of Bayesian statistics with a specific focus on modelling. Central to the Bayesian philosophy is the recognition that not only do the data possess a distribution, but so do the unknown parameters of the model. In this approach, the data are described by the likelihood and the parameters by their prior distributions. The combination of these quantities gives rise to the posterior distribution, and obtaining and interpreting this is the main objective of Bayesian inference. Prior distributions are subjective descriptions of personal beliefs about the unknown parameter values, based on the researchers’ past experience and/or experts’ opinion and intuition, whereas the posterior distribution is based on these prior distributions modified by the newly observed data.
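
As a brief sketch in symbols: writing y for the observed data and θ for the unknown parameters (illustrative notation, not taken from the course notes), the combination of likelihood and prior described above is Bayes' theorem:

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
  \;\propto\; p(y \mid \theta)\, p(\theta)
```

Here p(y | θ) is the likelihood, p(θ) is the prior and p(θ | y) is the posterior. The integral in the denominator rarely has a closed form, which is why the posterior usually has to be approximated numerically.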

Examples of Bayesian modelling will be presented during the course. In modern Bayesian inference and model choice, the posterior density is approximated by computer-intensive methods based on numerical integration, performed mainly by Markov chain Monte Carlo (MCMC) algorithms, which will also be introduced gently. Practical demonstrations of some MCMC examples will be shown via the software R (attendees are invited to install the R packages “Boom” and “MASS”).

Generalised Additive Models

The course will introduce basic GAM theory and application using the popular ‘mgcv’ package in R. We will also touch on modelling spatio-temporal data using GAMs. 

Brief outline:

  • Limitations of linear models and generalised linear models (GLMs) and when to use a GAM
  • Smoothing in one dimension
  • Basic GAM theory
  • Computing predictions and variance
  • Model checking and selection
  • Smoothing in multiple dimensions (space and time)

Prerequisites: a working knowledge of the R programming language and familiarity with linear models and, preferably, generalised linear models.