Framing
How to frame a machine learning problem
Framing is the process of defining the problem or task you’re aiming to solve. How you frame a problem is crucial: it determines what data to collect, how to split that data, which algorithms are appropriate, and which metrics to use to evaluate the model’s performance.
Define the Objectives
The first part of framing is to define the objectives. Think of the well-known book Walking Backwards, which explains how to reach a solution by starting from the final result: decide first what a successful outcome looks like, then work back to what the model has to do.
Specify the Input and Output
Inputs and outputs are the basis of any machine learning model.
Input(s)
These are the features or variables that the model uses to make predictions. They are normally represented by the symbol X.
Output(s)
These are what we’re trying to predict. They are normally represented by the symbol Y.
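As a minimal sketch of separating inputs from outputs, the Python snippet below splits a small table into X and Y. The house-price rows and the column names (`size_m2`, `rooms`, `price`) are made up for illustration and do not come from any real dataset.

```python
# Hypothetical dataset: each row describes one house (all values are made up).
rows = [
    {"size_m2": 50, "rooms": 2, "price": 150_000},
    {"size_m2": 80, "rooms": 3, "price": 240_000},
    {"size_m2": 120, "rooms": 4, "price": 330_000},
]

feature_names = ["size_m2", "rooms"]  # inputs the model is allowed to use
target_name = "price"                 # output we want to predict

# X holds one feature vector per example, Y holds the matching outputs.
X = [[row[name] for name in feature_names] for row in rows]
Y = [row[target_name] for row in rows]

print(X)  # [[50, 2], [80, 3], [120, 4]]
print(Y)  # [150000, 240000, 330000]
```

Keeping the target column out of X matters: if the output leaks into the inputs, the model appears to perform well but learns nothing useful.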
Type of Machine Learning Task
Depending on what you’re trying to achieve, you need to work out which type of machine learning problem can lead you to a good solution.
Supervised Learning:
You have input-output pairs in your data. The labelled examples are used to train the model, which then makes predictions for the unlabelled ones.
Labelled vs Unlabelled Examples
labelled example: {input, output}: {features, label}: (x, y)
unlabelled example: {input, ?}: {features, ?}: (x, ?)
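To make the (x, y) versus (x, ?) distinction concrete, here is a small Python sketch. It uses a 1-nearest-neighbour rule as a stand-in for "a trained model"; that choice, the numbers, and the labels `"small"` and `"large"` are assumptions made purely for illustration.

```python
# Labelled examples: (x, y) pairs. Values are illustrative only.
labelled = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

# Unlabelled examples: x is known, y is the "?" we want to fill in.
unlabelled = [1.2, 8.7]


def predict(x, training_pairs):
    """Return the label of the training example whose input is closest to x."""
    nearest_x, nearest_y = min(training_pairs, key=lambda pair: abs(pair[0] - x))
    return nearest_y


# Turn each (x, ?) into (x, predicted y).
predictions = [(x, predict(x, labelled)) for x in unlabelled]
print(predictions)  # [(1.2, 'small'), (8.7, 'large')]
```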
Unsupervised Learning
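You don’t have labelled outputs, so the model has to find structure in the inputs on its own, for example by grouping similar examples into clusters.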
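One common unsupervised task is clustering. The sketch below is a deliberately simplified k-means loop with k = 2 on one-dimensional data; the function name `two_means_1d`, the fixed starting centres, and the sample values are assumptions for illustration, not a production algorithm.

```python
def two_means_1d(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop (k = 2)."""
    centres = [min(values), max(values)]  # simple deterministic starting point
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the closest centre.
            index = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[index].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centres


# No labels are given: only the inputs.
values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
clusters, centres = two_means_1d(values)
print(clusters)  # [[1.0, 1.5, 2.0], [9.0, 9.5, 10.0]]
print(centres)   # [1.5, 9.5]
```

The algorithm never sees a label; the two groups emerge only from how close the inputs are to each other.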
Reinforcement Learning
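The model is trained with a reward/penalisation approach: it takes actions, receives rewards or penalties, and learns to favour the actions that earn the most reward.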
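A very small illustration of learning from rewards and penalties is an epsilon-greedy agent on a two-armed bandit. This is only one simple setting, chosen here as an assumption for the example; the success probabilities, the +1/-1 reward scheme, and the function name `run_bandit` are all hypothetical.

```python
import random


def run_bandit(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent; returns the estimated value of each action."""
    rng = random.Random(seed)
    n_actions = len(success_probs)
    estimates = [0.0] * n_actions
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            # exploit: pick the action that currently looks best
            action = max(range(n_actions), key=lambda a: estimates[a])
        # Reward +1 on success, penalty -1 on failure.
        reward = 1.0 if rng.random() < success_probs[action] else -1.0
        counts[action] += 1
        # Incremental average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


estimates = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(estimates, best_action)
```

The agent is never told which action is correct; it only sees the reward or penalty that follows each choice and adjusts its estimates accordingly.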
Semi-Supervised or Transfer Learning
When you only have a small batch of labelled data (semi-supervised learning), or you’re transferring knowledge from one task or domain to another (transfer learning).
Evaluation Metrics
Determine how you will measure the success of your model:
Regression problem:
A regression model predicts continuous values. Typical metrics are MAE (Mean Absolute Error) and RMSE (Root Mean Square Error).
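As a sketch of how these two regression metrics are computed, the snippet below implements MAE and RMSE with the Python standard library. The function names and the sample values are illustrative assumptions.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Made-up targets and predictions; the errors are 1, 0, -2, 0.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]

mae = mean_absolute_error(y_true, y_pred)
rmse = root_mean_squared_error(y_true, y_pred)
print(mae)   # 0.75
print(rmse)  # about 1.118
```

RMSE comes out larger than MAE here because squaring gives the single error of 2 more weight; that sensitivity to large errors is the usual reason to prefer one metric over the other.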
Classification Problem:
A classification model predicts discrete values. Typical metrics are accuracy, precision, recall, F1-score, etc.
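The classification metrics can be sketched in a few lines for the binary case. The helper name `classification_metrics`, the choice of 1 as the positive class, and the sample labels are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 2 true positives, 1 false positive, 2 false negatives, 3 true negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.625, precision about 0.667, recall 0.5, f1 about 0.571
```

The four numbers differ on the same predictions, which is why the metric has to be chosen as part of framing: precision penalises false positives, recall penalises false negatives, and F1 balances the two.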
Data Considerations
Think about:
- Type of data you need.
- How much data you need.
- Quality and relevance of the data.
- Potential biases.
Constraints
- Real-time requirements.
- Computational resources.
- Data privacy and security.