Optimization Problem Guidelines

This document provides instructions on the accepted data formats for the Optimization use case. Assigning the proper feature groups is crucial for getting optimal solutions.

Feature group specifications

The Optimization use case helps us solve integer programming problems, which can be used for scheduling, resource allocation, finding shortest paths, etc. The important feature groups for Optimization are the Assignment FG and the Constraint FG.

Assignment FG

Every optimization problem has an Assignment FG that describes all possible assignments in a schedule. There are 7 feature mappings:

ASSIGNMENT: This identifier is used for representing the variables that are assigned values when the optimization problem is solved. Can take integer, boolean and float entries. Every Assignment FG must have an assignment feature mapping.
BOUNDS: This column specifies the bounds for each assignment and whether the assignment must be an integer. Specified as a STRUCT with entries lowerBound (int), upperBound (int), isInteger (bool).
OBJECTIVE: This feature mapping represents the cost of an assignment in the objective to be minimized. In the planning problem, the objective is typically to minimize costs related to a schedule or a plan. The data in this column would represent the costs for each assignment.
SELECTOR: This is the identifier for a column we want to use in the prediction dashboard. We can assign specific values to these columns and get the assignments for those queries. These columns aren't used in the model training process. Can be used for multiple columns.
HINT: This feature represents an approximate value of the predicted assignment. The values in this column serve as hints or initial values for the assignments in the planning solution. The actual solution can be different. Can be used for multiple columns.
FIX: This column represents the value of predicted assignments that are fixed, i.e., they are not going to change in the solution. Can be used for multiple columns.
METRICS: The columns that are marked as 'metrics' show up in the metrics page after model training along with the values they take. Can be used for multiple columns.
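
To make the mappings concrete, here is a minimal sketch of what rows of an Assignment FG might look like, written as plain Python dictionaries. The column names (shift_id, cost, region, bounds) and all values are hypothetical; only the BOUNDS struct keys (lowerBound, upperBound, isInteger) come from the description above. The helper validate_bounds is illustrative and is not part of any platform API.

```python
# Hypothetical Assignment FG rows; column names and values are made up for illustration.
# Assumed mapping: shift_id -> ASSIGNMENT, cost -> OBJECTIVE,
# bounds -> BOUNDS, region -> SELECTOR.
assignment_rows = [
    {"shift_id": "alice_mon", "cost": 12.0, "region": "east",
     "bounds": {"lowerBound": 0, "upperBound": 1, "isInteger": True}},
    {"shift_id": "bob_mon", "cost": 9.5, "region": "west",
     "bounds": {"lowerBound": 0, "upperBound": 1, "isInteger": True}},
]

REQUIRED_BOUND_KEYS = {"lowerBound", "upperBound", "isInteger"}


def validate_bounds(bounds):
    """Return True if a BOUNDS struct has exactly the three expected entries
    and describes a non-empty range."""
    if set(bounds) != REQUIRED_BOUND_KEYS:
        return False
    if not isinstance(bounds["isInteger"], bool):
        return False
    return bounds["lowerBound"] <= bounds["upperBound"]


all_valid = all(validate_bounds(row["bounds"]) for row in assignment_rows)
print(all_valid)
```

Checking the struct before upload is only a convenience; the platform may apply its own validation.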

Constraint FG

Each Constraint FG represents a constraint family that can be configured. Its feature mappings are:

ASSIGNMENT: This represents an assignment that is present in the Assignment FG. The ASSIGNMENT feature in the Constraint FG references the unique identifier of the assignment from the Assignment FG.
GROUP: This represents the identifier for a single group of variables to be constrained. This is used to group together assignments that share a common constraint equation. So all assignments in a GROUP would have to collectively satisfy the given constraint.
COEFFICIENT: This represents the coefficient of the assignment term in the constraint group. In the context of a linear constraint, for example, each term in the linear equation is a product of a variable (an ASSIGNMENT) and a COEFFICIENT. The COEFFICIENT indicates how much weight that variable (or ASSIGNMENT) has in the constraint equation.
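
The way ASSIGNMENT, GROUP and COEFFICIENT combine into the left-hand side of a linear constraint can be sketched as follows. The rows, group names and solved values are hypothetical, and constraint_lhs is an illustrative helper, not a platform function.

```python
from collections import defaultdict

# Hypothetical Constraint FG rows: (ASSIGNMENT, GROUP, COEFFICIENT).
constraint_rows = [
    ("alice_mon", "monday_staff", 1.0),
    ("bob_mon", "monday_staff", 1.0),
    ("alice_mon", "alice_hours", 8.0),
]

# Hypothetical solved values for each assignment.
solution = {"alice_mon": 1, "bob_mon": 0}


def constraint_lhs(rows, values):
    """Sum COEFFICIENT * assignment value within each GROUP."""
    totals = defaultdict(float)
    for assignment, group, coefficient in rows:
        totals[group] += coefficient * values[assignment]
    return dict(totals)


lhs = constraint_lhs(constraint_rows, solution)
print(lhs)
```

Each key of lhs corresponds to one constraint equation; the same assignment can appear in several groups with different coefficients.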

Adding Constraints

For every constraint table we must specify the inequality, the constant, the penalty for breaking the constraint (used in case of infeasibility), and the enforcement (hard/soft). Consider the inequality and constant as the right-hand side (RHS) of the inequality formed by the Constraint FG.
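
A minimal sketch of how these four settings could be applied to one constraint group is shown below. It rests on assumptions that the text above does not confirm: that the inequality is one of "<=", ">=" or "==", that a violated soft constraint costs penalty times the size of the violation, and that a violated hard constraint makes the solution infeasible. The function check_constraint is hypothetical.

```python
import operator

# Inequality symbols supported by this sketch (an assumption; the values
# accepted by the platform may differ).
COMPARATORS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


def check_constraint(lhs, inequality, constant, penalty, enforcement):
    """Return (satisfied, penalty_cost) for one constraint group.

    Assumption: a violated soft constraint costs penalty * violation amount,
    while a violated hard constraint is treated as infeasible (infinite cost).
    """
    satisfied = COMPARATORS[inequality](lhs, constant)
    if satisfied:
        return True, 0.0
    if enforcement == "hard":
        return False, float("inf")
    return False, penalty * abs(lhs - constant)


# lhs is the sum of COEFFICIENT * assignment value for the group.
print(check_constraint(3.0, ">=", 2, 10.0, "hard"))
print(check_constraint(1.0, ">=", 2, 10.0, "soft"))
```

Under these assumptions a soft constraint lets the solver trade a violation against the objective, whereas a hard constraint can never be broken.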

Custom Table

This dataset describes all forced assignments in a schedule. So, when the model is learning from the data, it can learn to incorporate these forced assignments into any solutions it generates. This is particularly useful when there are hard constraints or specific requirements that absolutely must be taken into account, such as resource availability or certain tasks needing to be completed before others.

ASSIGNMENT: The assignment we want to specify a value for.
ASSIGNMENT_VALUE: The value to assign.
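
One common way to model a forced assignment is to collapse its bounds so that lowerBound equals upperBound. The sketch below illustrates that idea only; it is an assumption about the mechanics, not a description of how the platform handles the custom table. All names and values are hypothetical.

```python
# Hypothetical bounds per assignment, keyed by the ASSIGNMENT identifier.
bounds = {
    "alice_mon": {"lowerBound": 0, "upperBound": 1, "isInteger": True},
    "bob_mon": {"lowerBound": 0, "upperBound": 1, "isInteger": True},
}

# Hypothetical custom table rows: (ASSIGNMENT, ASSIGNMENT_VALUE).
forced = [("bob_mon", 1)]


def apply_forced_assignments(bounds, forced_rows):
    """Return a copy of bounds in which every forced assignment is pinned
    to its ASSIGNMENT_VALUE (lowerBound == upperBound)."""
    pinned = {name: dict(b) for name, b in bounds.items()}
    for assignment, value in forced_rows:
        if assignment not in pinned:
            raise KeyError(f"unknown assignment: {assignment}")
        pinned[assignment]["lowerBound"] = value
        pinned[assignment]["upperBound"] = value
    return pinned


pinned_bounds = apply_forced_assignments(bounds, forced)
print(pinned_bounds["bob_mon"])
```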

________________________________

Example: Stigler Diet Problem

The goal of the diet problem is to select a set of foods that will satisfy a set of daily nutritional requirements at minimum cost. The problem is formulated as a linear program where the objective is to minimize cost and the constraints are to satisfy the specified nutritional requirements. Here we are solving it for float assignments.

Input Dataset

Food: 77 food items and the cost-normalised amount of each of the 9 nutrients in each item.
Nutritional Requirement: 9 nutrients and their daily required amounts.

Assignment FG

The 'Food' dataset is used to create the assignment FG. The feature mappings are:

ASSIGNMENT : The column with names of food. The goal is to attach values to each food item to signify the daily amount required.
OBJECTIVE : The Cost column is the objective. In this problem the cost is 1 for each food because the nutrient content is normalised with respect to cost.
BOUNDS : We created a separate bounds column with lowerBound 0, upperBound 100, isInteger False to support float assignments.
All other columns can be either SELECTOR or METRICS.
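
The construction of this Assignment FG can be sketched on a toy stand-in for the 'Food' dataset. The food names, nutrient columns and numbers below are hypothetical and are not taken from the real 77-item dataset; only the cost of 1 and the bounds (0, 100, False) come from the description above. The helper to_assignment_rows is illustrative.

```python
# Toy stand-in for the 'Food' dataset: nutrient amounts per unit of cost.
# Names and numbers are hypothetical.
food = [
    {"Commodity": "food_a", "Calories": 40.0, "Protein": 1000.0},
    {"Commodity": "food_b", "Calories": 10.0, "Protein": 3000.0},
]


def to_assignment_rows(food_rows, upper_bound=100):
    """Add the OBJECTIVE (Cost) and BOUNDS columns to each food row.

    Commodity is the ASSIGNMENT column; the nutrient columns can be mapped
    as SELECTOR or METRICS.
    """
    rows = []
    for item in food_rows:
        row = dict(item)  # copy so the input dataset is left untouched
        row["Cost"] = 1   # nutrients are normalised by cost, so cost is 1
        row["Bounds"] = {"lowerBound": 0, "upperBound": upper_bound,
                         "isInteger": False}  # float assignments
        rows.append(row)
    return rows


assignment_fg = to_assignment_rows(food)
print(assignment_fg[0])
```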

Constraint FG

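The idea is to express the constraints as linear expressions of the type: {sum for i=1 to n (a_i * b_i)} <inequality> <constant>, where a_i is the COEFFICIENT of the i-th assignment in the group and b_i is the value of that assignment.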

We create a table with 78 x 9 rows. Each food item and nutrient pair has a separate entry in this table (77 x 9 rows). The columns are Commodity, Nutrient and Value. For each nutrient we also added a row where Commodity is null, Nutrient is the name of the nutrient and Value is the negative of the daily nutritional requirement (1 x 9 rows). The feature mappings are:
1. ASSIGNMENT : Commodity
2. GROUP : Nutrients. Each constraint equation is grouped by the common nutrient.
3. COEFFICIENT : Value, i.e., the amount of the nutrient for each assignment.
4. Constraint inequality: Here we have used the >= inequality with 0 as the constant.
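
The table construction and the resulting constraint check can be sketched on toy data. The foods, nutrients, requirements and candidate amounts are hypothetical. The sketch also assumes that a row with a null Commodity contributes its Value as a constant term (as if multiplied by 1), which is how the negative requirement ends up on the left-hand side of the >= 0 inequality; this is an interpretation of the description above, not confirmed platform behaviour.

```python
# Toy data (hypothetical): nutrient amounts per unit of cost, and daily requirements.
food = [
    {"Commodity": "food_a", "Calories": 40.0, "Protein": 1000.0},
    {"Commodity": "food_b", "Calories": 10.0, "Protein": 3000.0},
]
requirements = {"Calories": 3.0, "Protein": 70.0}


def build_constraint_rows(food_rows, requirements):
    """One row per (food, nutrient) pair plus one null-Commodity row per
    nutrient holding the negated requirement."""
    rows = []
    for nutrient, required in requirements.items():
        for item in food_rows:
            rows.append({"Commodity": item["Commodity"],
                         "Nutrient": nutrient,
                         "Value": item[nutrient]})
        rows.append({"Commodity": None, "Nutrient": nutrient, "Value": -required})
    return rows


def group_totals(rows, amounts):
    """Sum Value * amount per Nutrient group.
    Assumption: a null Commodity acts as a constant term (amount 1)."""
    totals = {}
    for row in rows:
        amount = 1.0 if row["Commodity"] is None else amounts[row["Commodity"]]
        totals[row["Nutrient"]] = totals.get(row["Nutrient"], 0.0) + row["Value"] * amount
    return totals


def is_feasible(rows, amounts):
    """Every nutrient group must satisfy total >= 0."""
    return all(total >= 0 for total in group_totals(rows, amounts).values())


constraint_rows = build_constraint_rows(food, requirements)
candidate = {"food_a": 0.25, "food_b": 0.0}  # hypothetical daily amounts
print(len(constraint_rows), group_totals(constraint_rows, candidate))
print(is_feasible(constraint_rows, candidate))
```

With 2 foods and 2 nutrients the table has (2 + 1) x 2 = 6 rows, mirroring the (77 + 1) x 9 layout of the full problem.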