Framing

Framing is the technique used to define a problem before you try to solve it.

Framing is the process of defining the problem or task you’re aiming to solve. Framing a problem is crucial because it determines what data to collect, how to split the data, which algorithms are appropriate and which metrics to use to evaluate the model’s performance.

Define the Objectives

The first part of framing is to define the objectives. Think of the well-known book Working Backwards, which explains how to arrive at a solution by starting from the final result.

Specify the Input and Output

Inputs and outputs are the basic building blocks of any machine learning model.

Input(s)

These are the features or variables that the model will use to make predictions. They are normally represented with the symbol X.

Output(s)

These are the values we’re trying to predict. They are normally represented with the symbol Y.
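As a minimal sketch of the inputs and outputs described above, the snippet below builds a tiny dataset by hand. The house data, the feature choice and the variable names are hypothetical and exist only for illustration.

```python
# Hypothetical toy dataset: each row describes one house.
# X holds the inputs (features): [size in square metres, number of rooms].
X = [
    [50.0, 2],
    [80.0, 3],
    [120.0, 4],
]

# y holds the outputs (labels): the sale price of each house.
y = [150_000, 240_000, 360_000]

# Every row of X needs exactly one matching entry in y.
assert len(X) == len(y)

n_examples = len(X)
n_features = len(X[0])
print(f"{n_examples} examples, {n_features} features each")
```

Each row of X pairs with the entry of y at the same position, which is why the two lists must have the same length.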

Type of Machine Learning Task

Depending on what you’re trying to achieve, you need to understand which type of machine learning problem you are dealing with, because that determines which approaches can lead to a good solution.

Supervised Learning

You have input-output pairs in your data. The labelled examples are used to train your model, and the trained model then makes predictions for the unlabelled examples.

Labelled vs unlabelled examples

Labelled example: {input, output}: {features, label}: (x, y)
Unlabelled example: {input, ?}: {features, ?}: (x, ?)
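The following sketch shows one way labelled examples can be used to predict the missing output of unlabelled ones. It uses a hand-written nearest-neighbour rule chosen purely for brevity; the data (hours studied mapped to an exam result) and the function name predict_nearest are hypothetical.

```python
# Labelled examples: (x, y) pairs.
# Hypothetical data: hours studied -> exam result.
labelled = [(1.0, "fail"), (2.0, "fail"), (6.0, "pass"), (8.0, "pass")]

# Unlabelled examples: (x, ?) where the output is unknown.
unlabelled = [1.2, 7.5]


def predict_nearest(x, training_examples):
    """Return the label of the training example whose input is closest to x."""
    _, nearest_label = min(training_examples, key=lambda pair: abs(pair[0] - x))
    return nearest_label


predictions = [predict_nearest(x, labelled) for x in unlabelled]
print(predictions)  # ['fail', 'pass']
```

Real projects would normally use a library model rather than this rule, but the flow is the same: learn from (x, y) pairs, then fill in the ? for new x values.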

Unsupervised Learning

You don’t have labelled outputs. The model has to find structure, such as groups of similar examples, in the inputs alone.
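As an illustration of learning without labels, here is a minimal sketch that groups one-dimensional values into two clusters with a basic k-means style loop. The function name two_means_1d, the starting centres and the sample values are assumptions made for this example.

```python
def two_means_1d(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop."""
    # Assumption: start the two centres at the smallest and largest value.
    centres = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the closest centre.
            index = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[index].append(v)
        # Move each centre to the mean of its cluster
        # (keep the old centre if the cluster is empty).
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return clusters


# No labels here: only inputs.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
low, high = two_means_1d(values)
print(low, high)  # [1.0, 1.2, 0.8] [9.0, 9.5, 10.1]
```

The algorithm is never told which group a value belongs to; the two groups emerge from the distances between the inputs.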

Reinforcement Learning

The model is trained with a reward/penalty approach: it receives rewards for good actions and penalties for bad ones.
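The reward/penalty idea can be sketched with a deliberately tiny example. This is not a full reinforcement learning algorithm: the two-action environment, the fixed rewards, the alternating exploration and the learning rate are all assumptions chosen to keep the code deterministic and short.

```python
# Hypothetical environment: "right" earns a reward, "left" earns a penalty.
REWARDS = {"left": -1.0, "right": 1.0}


def train(episodes=20, learning_rate=0.5):
    """Learn a value estimate per action from rewards and penalties."""
    values = {"left": 0.0, "right": 0.0}
    actions = list(values)
    for episode in range(episodes):
        # Assumption: alternate actions so both get tried (no randomness).
        action = actions[episode % len(actions)]
        reward = REWARDS[action]
        # Nudge the estimate towards the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


values = train()
best_action = max(values, key=values.get)
print(best_action)  # right
```

After training, the action that was rewarded has the higher estimated value, so the agent prefers it.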

Semi-Supervised or Transfer Learning

When you only have a small batch of labelled data (semi-supervised learning), or you’re transferring knowledge from one task or domain to another (transfer learning).

Evaluation Metrics

Determine how you will measure the success of your model.

Regression Problem

A regression model predicts continuous values. Common metrics are MAE (Mean Absolute Error) and RMSE (Root Mean Square Error).
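To make the two regression metrics concrete, the sketch below computes them by hand on made-up numbers. The function names and the sample values are illustrative assumptions; in practice a library implementation would normally be used.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical values: three perfect predictions and one that is off by 4.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [3.0, 5.0, 7.0, 13.0]

mae = mean_absolute_error(y_true, y_pred)
rmse = root_mean_squared_error(y_true, y_pred)
print(mae, rmse)  # 1.0 2.0
```

The single large error counts more heavily in RMSE than in MAE, which is one reason to pick RMSE when large mistakes are especially costly.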

Classification Problem

A classification model predicts discrete values. Common metrics are accuracy, precision, recall and F1-score.
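The classification metrics listed above can be sketched for a binary problem as follows. The function name classification_metrics, the choice of 1 as the positive class and the sample labels are assumptions for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 is the positive class, 0 the negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Here the model finds only half of the positives (recall) while two out of its three positive predictions are right (precision), which shows why accuracy alone can hide how a model fails.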

Data Considerations

Think about:

Type of data you need.
How much data you need.
Quality and relevance of the data.
Potential biases.

Constraints

Real-time requirements.
Computational resources.
Data privacy and security.