
Master's Thesis in Applied Machine Learning

This repository contains guidance for students doing a master's thesis supervised by Alex Jung at Aalto University.


Table of Contents

  1. What to Expect from Your Supervisor
  2. What Is Expected from You
  3. Getting Started
  4. Typical Timeline
  5. Recommended Development Environment
  6. Responsible Use of AI
  7. Practical Workflow
  8. Thesis Manuscript Preparation
  9. Typesetting Mathematical Texts
  10. Iterative Writing Process
  11. Final Thesis Checklist
  12. Thesis Presentation and Self-Evaluation
  13. Thesis Evaluation, Decision, and Appeals
  14. References
  15. Feedback and Questions

What to Expect from Your Supervisor

As your supervisor, I will:

  • Help you clearly define your ML problem.
  • Advise on suitable ML methods, tools, and resources.
  • Guide you through thesis writing and evaluation.
  • Provide feedback on drafts and self-assessments.
  • Offer the opportunity to discuss your self-assessment before submission.

What Is Expected from You

As a thesis student, you are expected to:

  • Take ownership of your research and drive progress independently.
  • Come to meetings prepared with concrete questions or results.
  • Communicate proactively — notify me early if you are stuck or falling behind schedule.
  • Use high-quality scientific references (peer-reviewed journals, reputable conferences, established textbooks).
  • Follow the writing and typesetting conventions described in this guide.

Getting Started

To start your thesis:

  1. Formulate your ML problem clearly by identifying data points, their features, and labels (watch this video).
  2. Choose suitable ML models that you are comfortable implementing (e.g., linear regression, random forests, or neural networks).
  3. Identify data sources and evaluation criteria (e.g., test accuracy, computational efficiency).
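As a rough illustration of steps 1 to 3, the sketch below writes down data points, features, labels, and an evaluation criterion for a made-up task: predicting a day's maximum temperature from its minimum temperature. The class `DataPoint`, the helpers `features`, `label`, and `mean_squared_error`, the toy numbers, and the fixed rule "maximum = minimum + 5" are all hypothetical and serve only to show the structure.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DataPoint:
    """One data point: a single day at a (hypothetical) weather station."""
    min_temp: float  # used as the feature, in degrees Celsius
    max_temp: float  # used as the label, in degrees Celsius


def features(dp):
    """Feature vector of a data point (here a single number)."""
    return [dp.min_temp]


def label(dp):
    """Label of a data point: the quantity we want to predict."""
    return dp.max_temp


def mean_squared_error(y_true, y_pred):
    """Evaluation criterion: average squared prediction error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Made-up toy dataset, for illustration only.
dataset = [
    DataPoint(-3.0, 2.0),
    DataPoint(1.0, 6.0),
    DataPoint(4.0, 8.0),
    DataPoint(7.0, 12.0),
]

X = [features(dp) for dp in dataset]
y = [label(dp) for dp in dataset]

# A deliberately simple hypothesis: predict max_temp = min_temp + 5.
y_pred = [x[0] + 5.0 for x in X]
mse = mean_squared_error(y, y_pred)
print(f"mean squared error: {mse}")
```

Writing the problem down in this explicit form early makes it easier to state the same definitions precisely in the thesis text later.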

Detailed guidance is available in Chapter 2 of the textbook and in these lecture videos.


Typical Timeline

A master's thesis typically spans 6–12 months. Below is a rough breakdown:

| Phase | Activities |
|---|---|
| Problem Definition | Identify research question, data sources, and evaluation criteria |
| Literature Review | Survey related work; identify gaps your thesis addresses |
| Data Collection & Preprocessing | Gather, clean, and explore your dataset |
| Modeling | Implement and train ML models; run baseline experiments |
| Evaluation & Diagnosis | Benchmarks, sensitivity analysis, error analysis |
| Writing | Draft chapters iteratively; incorporate feedback |
| Self-Assessment & Presentation | Complete evaluation form; prepare and deliver thesis presentation |

Recommended Development Environment

Visual Studio Code is a good choice for thesis work: it supports LaTeX (via the LaTeX Workshop extension), Python, and Jupyter notebooks in one place.

The Claude Code extension for VS Code integrates an AI assistant directly into your editor. You can use it to:

  • Explain or refactor Python/LaTeX code in context
  • Get feedback on a selected paragraph or equation
  • Generate boilerplate (e.g., plotting code, pseudocode skeletons)
  • Ask questions about your codebase without leaving the editor

Note: Treat AI-generated content critically. Verify any code it produces, and do not use it to write substantive thesis text — your analysis and conclusions must be your own.


Responsible Use of AI

Aalto University has official policies on AI use in research and studies, and you are expected to follow them. The key principles are summarised below.

Authorship and Accountability

  • AI cannot be listed as an author. You bear full responsibility for every claim, result, and conclusion in your thesis.
  • Never use AI to disclaim responsibility: the fact that an AI tool produced something does not excuse errors or misconduct.

Disclosure

  • Always disclose when and how you used AI tools. In a thesis, this belongs in a dedicated statement — not in the Methods section, which is reserved for your actual research methods.
  • Record which tool, version, and settings you used so that your process is transparent. Exact reproduction is often impossible because online services are updated frequently, which makes this record all the more important.

Data Protection and IP

  • Do not upload personal data, confidential data, or unpublished manuscripts to public AI services — this may violate GDPR.
  • Be aware that AI-generated content may embed others' work without traceable references. Verify all citations independently.
  • For sensitive data, use local or GDPR-compliant AI tools only.

Permitted Uses in Thesis Work

AI tools can support your work without compromising integrity when used for:

  • Proofreading language and grammar
  • Brainstorming research directions or experiment designs
  • Explaining concepts or summarising background literature (always verify)
  • Generating boilerplate code or plotting templates (always review and test)
  • Identifying counterarguments to strengthen your reasoning

What AI Cannot Replace

  • Your own critical analysis and interpretation of results
  • Independent evaluation of source quality and relevance

Practical Workflow

A master's thesis in machine learning typically involves:

  • Data Collection and Preprocessing using, e.g., pandas.
  • Model Training and Validation using, e.g., scikit-learn.
  • Model Diagnosis using numerical experiments (benchmarks, sensitivity analysis) and, when appropriate, mathematical analysis (generalisation bounds, error analysis, comparison to the Bayes risk).

For academic sources, use:

  • Aalto University Library: https://primo.aalto.fi/discovery/search?vid=358AALTO_INST:VU1&lang=en
  • IEEE Xplore: https://ieeexplore.ieee.org
  • ACM Digital Library: https://dl.acm.org
  • Scopus: https://www.scopus.com
  • Web of Science: https://www.webofscience.com

To assess the quality of journals and conferences, consult the JUFO ranking system: https://jfp.csc.fi/jufoportal

If you are uncertain about a reference's quality, ask me.


Thesis Manuscript Preparation

When preparing your thesis, ensure:

  • Terminology: Use terms defined in the Aalto Dictionary of ML. You are encouraged to reuse its LaTeX source (e.g., TikZ figures).
  • Problem formulation: State clearly what the data points are and how their features and labels are defined.
  • Loss functions: Explicitly state the loss function used for training and, separately, for validation or testing.
  • Numerical results: Present and discuss results thoroughly to answer your research questions.
  • Baselines: Use appropriate baselines or benchmarks (e.g., Kaggle competitions).
  • Structure: Begin each chapter and section with an introductory paragraph explaining its content and its connection to the rest of the thesis.
  • Equations: Reference all numbered equations using \eqref{}. Only number equations that are referenced in the text; leave unreferenced equations unnumbered.
  • Algorithms: Present new methods as pseudocode (see examples).
  • Figures: Ensure all figures are clear, labelled, and have informative captions (caption guidelines).
  • References: Format according to IEEE guidelines.

For creating effective figures, see Edward Tufte's The Visual Display of Quantitative Information.
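To illustrate the equation-numbering and loss-function conventions from the list above, here is a minimal LaTeX sketch. It assumes the amsmath package, which provides \eqref. The label name eq:train_loss, the choice of squared error for training and absolute error for validation, and the symbols $n$ and $m$ for the set sizes are illustrative assumptions, not requirements.

```latex
% Preamble: \usepackage{amsmath}  % provides \eqref

We learn the parameters $\theta$ by minimising the average squared error
\begin{equation}
  \label{eq:train_loss}
  L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i;\theta) \bigr)^2
\end{equation}
over the $n$ data points of the training set. The minimiser of
\eqref{eq:train_loss} is then assessed on a separate validation set.

% Not referenced anywhere in the text, so left unnumbered.
For validation we report the average absolute error
\[
  \frac{1}{m} \sum_{i=1}^{m} \bigl| y_i - f(x_i;\theta) \bigr|
\]
over the $m$ data points of the validation set.
```

The first display is numbered because the text refers back to it, while the second is stated once and never referenced, so it stays unnumbered.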


Typesetting Mathematical Texts

Display vs Inline Math

  • Use inline math ($...$) for short expressions within a sentence: The loss is defined as $L(\theta)$.
  • Use display math (\[ ... \] or the equation environment) for standalone equations that are central or referenced.

Punctuation with Displayed Equations

Punctuate displayed math as part of the surrounding sentence:

The empirical risk is defined as
\[
L(\theta) = \frac{1}{n} \sum_{i=1}^n \ell(f(x_i;\theta), y_i).
\]
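
When the sentence continues after the display, the equation ends with a comma rather than a full stop. A minimal sketch reusing the empirical risk from the example above (the wording of the where-clause is only illustrative):

```latex
The empirical risk is defined as
\[
L(\theta) = \frac{1}{n} \sum_{i=1}^n \ell(f(x_i;\theta), y_i),
\]
where $\ell$ denotes the loss function and $n$ is the number of data points.
```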

Iterative Writing Process

  • While experimenting, keep a separate working notes file with the methods you try, the hyperparameters used, and the results — without worrying about prose, formatting, or citations. Treat it as a lab notebook.
  • Start serious writing with the literature review or methodology chapters. These stabilise earliest and rarely need to be rewritten when results shift.
  • Defer the results and discussion chapters until your experiments have converged — results often change substantially as experiments progress, and writing them up too early wastes effort.
  • Write the abstract last. It is the shortest section but depends on everything else being settled.
  • Incorporate feedback regularly from peers, group meetings, or, where appropriate, LLM-based tools.
  • Expect and budget for multiple revision rounds before submission.
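
One possible way to keep the working notes file from the first point is an append-only CSV log with one row per experiment run. The sketch below is a hypothetical helper rather than a prescribed format: the function name `log_run`, the column names, the file name `experiment_notes.csv`, and the logged values are all made up for illustration.

```python
import csv
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

FIELDS = ["timestamp", "method", "hyperparameters", "metric", "value", "comment"]


def log_run(path, method, hyperparameters, metric, value, comment=""):
    """Append one experiment record to a CSV notes file, writing the header once."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "method": method,
            # Store hyperparameters as JSON so they can be parsed back later.
            "hyperparameters": json.dumps(hyperparameters, sort_keys=True),
            "metric": metric,
            "value": value,
            "comment": comment,
        })


# Example usage with made-up numbers; a temporary directory keeps the demo tidy.
notes = Path(tempfile.mkdtemp()) / "experiment_notes.csv"
log_run(notes, "ridge regression", {"alpha": 0.1}, "val_mse", 1.23, "first try")
log_run(notes, "ridge regression", {"alpha": 1.0}, "val_mse", 1.07)

with notes.open(newline="", encoding="utf-8") as fh:
    rows = list(csv.DictReader(fh))
print(f"{len(rows)} runs logged")
```

In practice the notes file would live next to your code under version control, so that each result reported in the thesis can be traced back to a logged run.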

Final Thesis Checklist

Before submitting, verify each item below. Links point to the relevant section of this guide.

  • ML problem is precisely formulated: data points, features, and labels are clearly defined (see Getting Started)
  • Loss functions for training and evaluation are explicitly stated (see Manuscript Preparation)
  • Methods are clearly described, including pseudocode for new algorithms
  • Baselines or benchmarks are included and discussed
  • All figures have labelled axes and informative captions
  • All numbered equations, tables, and figures are referenced in the text
  • Citations are formatted according to IEEE guidelines
  • Self-assessment form is completed (form here)

Thesis Presentation and Self-Evaluation

After completing the thesis manuscript:

  • Complete the detailed self-assessment (evaluation form), with explicit references to sections of your thesis.
  • Review the grade characterisation PDF to understand what constitutes a high-quality thesis.
  • Optionally, request a meeting to discuss your self-assessment before submission.
  • Prepare your thesis presentation — either live during a group meeting or as a recorded video (see examples).

Thesis Evaluation, Decision, and Appeals

YouTube overview

How the grade is decided

  • Your thesis is evaluated against the official programme and school-level criteria.
  • I prepare a written evaluation and a grade proposal based on these criteria.
  • The final grade is confirmed by the programme or school following Aalto University's formal procedures.
  • Study the grade characterisation document carefully before submission.

Transparency and feedback

  • You have the right to see the evaluation criteria applied to your thesis.
  • You may request clarification on how your thesis was assessed and how the grade was formed.
  • The self-assessment form is an important part of this process.

Appealing a grading decision

  • If you believe an error occurred, you have the right to request rectification of the grade.
  • Appeals must be submitted within 14 days of being notified of the grade or of being given the opportunity to review the evaluation.
  • See Academic Appeals at Aalto University for the formal procedure.

Practical advice: If you are unsure whether an appeal is appropriate, discuss the evaluation with me first. Many issues can be resolved without a formal appeal.


References

ML Fundamentals

Writing and Typesetting


Feedback and Questions

Reach out via: