Curve Fit Reference

Curve Fits are a key feature of Yob. They are set up to work well "right out of the box," but they are also highly customizable. This guide serves as a reference for using Curve Fits. If you are new to Curve Fits, you may want to check out one of the following tutorials first:


General Rules

  • At a minimum, Curve Fits need to know which Data Set to use, and what type of Model to fit to the data.
  • Curve Fits need sufficient data to work properly. In general, complex models need more data than simple models.
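Yob handles fitting internally, but the data-sufficiency rule can be sketched in Python (NumPy here is a stand-in for Yob's solver; the data and variable names are illustrative): a model with p parameters needs at least p data points, and in practice more.

```python
import numpy as np

# Hypothetical stand-in for a Yob Data Set: paired x/y observations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# A model with p parameters needs at least p data points.
# Linear (2 parameters) is fine with 5 points; a quintic (6 parameters)
# would be underdetermined by this data set.
A, B = np.polyfit(x, y, deg=1)
print(A, B)  # slope near 2, intercept near 1
```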

Built-in Models

Curve Fits come with the following built-in models:

Model Name              Expression
Constant                cf(x) = C
Proportional            cf(x) = A*x
Linear                  cf(x) = A*x + B
Quadratic               cf(x) = A*x^2 + B*x + C
Cubic                   cf(x) = A*x^3 + B*x^2 + C*x + D
Quartic                 cf(x) = A*x^4 + B*x^3 + C*x^2 + D*x + F
Quintic                 cf(x) = A*x^5 + B*x^4 + C*x^3 + D*x^2 + F*x + G
Power                   cf(x) = A*x^B
Inverse                 cf(x) = A / x
Inverse Square          cf(x) = A / (x^2)
Sinusoid                cf(x) = A*sin(B*x + C) + D
Exponential             cf(x) = A*B^x
Natural Exponential     cf(x) = A*e^(B*x)
Logarithmic             cf(x) = A + B*log(x)
Natural Logarithmic     cf(x) = A + B*ln(x)
Gaussian                cf(x) = A*e^(-(x-B)^2 / C^2) + D
Normalized Gaussian     cf(x) = (1 / (S*sqrt(2*π))) * e^(-(x-M)^2 / (2*S^2))
Logistic                cf(x) = A / (1 + e^(-B*x + C))
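Each built-in model is a fixed expression with free parameters. As a rough sketch of what fitting one means (using SciPy's curve_fit as a stand-in for Yob's solver, with synthetic data), the Natural Exponential model could be fitted like this:

```python
import numpy as np
from scipy.optimize import curve_fit

# The Natural Exponential built-in model, written as a Python function.
def natural_exponential(x, A, B):
    return A * np.exp(B * x)

# Synthetic data following 2 * e^(0.5*x).
x = np.linspace(0, 3, 20)
y = 2.0 * np.exp(0.5 * x)

params, _ = curve_fit(natural_exponential, x, y, p0=(1.0, 1.0))
A, B = params
print(A, B)  # recovers A ≈ 2.0, B ≈ 0.5
```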

Custom Models

Curve Fits also support user-defined models. Note that all the rules of Yob expressions apply. See the Expression Reference for more details.
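A hypothetical custom model — here an exponentially damped sine, which is not in the built-in table — can be fitted the same way. This is sketched with SciPy rather than Yob's own engine, and the model and data are invented for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

# A user-defined model: exponentially damped sine with parameters A, B, C.
def damped_sine(x, A, B, C):
    return A * np.exp(-B * x) * np.sin(C * x)

# Synthetic data generated from A=3, B=0.4, C=2.
x = np.linspace(0, 10, 200)
y = 3.0 * np.exp(-0.4 * x) * np.sin(2.0 * x)

# Custom models generally need sensible guess parameters (see below).
params, _ = curve_fit(damped_sine, x, y, p0=(2.0, 0.5, 2.0))
print(params)  # ≈ [3.0, 0.4, 2.0]
```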


Regression Methods

Curve Fits attempt to find the best model parameters by minimizing, in one form or another, the residuals: the differences between the observed outputs and the model's predicted outputs.

Yob supports several different methods of solving this problem, but by far the most common (and the default) is the Ordinary Least Squares (OLS) algorithm. Each method is described in more detail below.

OLS - Ordinary Least Squares

The Ordinary Least Squares algorithm is the most robust and general-purpose algorithm, making it suitable for almost any data. It finds the best model parameters by minimizing the sum of the squared residuals. In mathematical terms, it tries to minimize Σ (y_i - cf(x_i))^2. Since the residuals are squared, points that are further away from the curve are penalized more harshly than points that are close.
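For a linear model, the OLS objective has a closed-form minimizer. A minimal sketch in NumPy with illustrative data (this is not Yob's internal code):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 2.1, 3.9, 6.2])

# OLS for a linear model cf(x) = A*x + B:
# minimize sum((y_i - (A*x_i + B))^2), solved via the least-squares solver.
X = np.column_stack([x, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
A, B = coeffs

ols_objective = np.sum((y - (A * x + B)) ** 2)
print(A, B, ols_objective)  # the minimized sum of squared residuals
```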

LAD - Least Absolute Deviations

The Least Absolute Deviations algorithm attempts to minimize the sum of the absolute residuals, as opposed to the squared residuals used by OLS. In mathematical terms, it tries to minimize Σ |y_i - cf(x_i)|. Since the residuals are not squared, this algorithm handles outliers somewhat better than OLS. However, unlike OLS, there are cases where multiple solutions exist, which can cause instability.
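A sketch of LAD on data containing one outlier, minimizing the absolute-residual objective with a derivative-free optimizer (SciPy here stands in for Yob's solver; the data is invented):

```python
import numpy as np
from scipy.optimize import minimize

# Four points on the line y = x, plus one large outlier.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.0, 2.0, 3.0, 20.0])

def lad_objective(params):
    A, B = params
    return np.sum(np.abs(y - (A * x + B)))

# Nelder-Mead copes with the non-smooth absolute-value objective.
result = minimize(lad_objective, x0=(1.5, 0.5), method="Nelder-Mead")
A, B = result.x
print(A, B)  # close to the true line y = x, despite the outlier
```

An OLS fit of the same data would be pulled noticeably toward the outlier; LAD essentially ignores it.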

WLS_1Y - Weighted Least Squares, 1 / Y

Both OLS and LAD assume that error is distributed evenly throughout the data, which is not always the case. Take a look at the following data for example:

You can see that the error gets larger as Y gets larger. The Weighted Least Squares algorithm is a variation on OLS that addresses this problem by weighting each residual: it pays extra attention to data points with small Y values and less attention to points with large Y values, since those are assumed to carry more error. In mathematical terms, the algorithm tries to minimize Σ W_i * (y_i - cf(x_i))^2. For this particular version of WLS, W_i = 1/y_i. The use cases of this algorithm are more limited than OLS, but recognizing when it is appropriate to apply it can make a big difference in your experiment. This is demonstrated graphically in the next variation of WLS below.

WLS_1Y2 - Weighted Least Squares, 1 / Y^2

This variation on WLS functions just like the previous one; the only difference is that W_i = 1/y_i^2. The graph below shows the differences between a line fitted with OLS and a line fitted with WLS, using the data from the previous example:
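Both weightings can be sketched with np.polyfit, whose w argument multiplies each residual before squaring — so passing w = sqrt(W_i) reproduces the WLS objective Σ W_i * (y_i - cf(x_i))^2. The data below is synthetic, and this is not Yob's implementation:

```python
import numpy as np

# Synthetic data whose noise grows with y: larger values are less reliable.
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 50)
y = 3.0 * x + rng.normal(scale=0.2 * x)

# np.polyfit minimizes sum((w_i * r_i)^2), so w = sqrt(W_i):
#   W_i = 1/y_i   ->  w = 1/sqrt(y)
#   W_i = 1/y_i^2 ->  w = 1/y
ols = np.polyfit(x, y, deg=1)
wls_1y = np.polyfit(x, y, deg=1, w=1.0 / np.sqrt(y))
wls_1y2 = np.polyfit(x, y, deg=1, w=1.0 / y)

print(ols[0], wls_1y[0], wls_1y2[0])  # three slope estimates near 3
```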


Guess Parameters

The curve fitting algorithms in Yob work by taking an initial set of parameters (or guess parameters) for a model, and iteratively improving them until the optimal parameters are found. The following example illustrates this concept:

Suppose we have some data that we are trying to fit a sinusoid model to. The sinusoid model is expressed by A*sin(B*x + C) + D, which means that the model parameters are A, B, C, and D. If we pick a few reasonable values for these parameters, we might get a curve like the green dashed line below. Using this starting point, Yob will repeatedly try new values for the parameters in the neighborhood of the initial values. If the new values result in a curve that better fits the data, Yob will use those values as the new starting point and continue searching for better ones.

When Yob can no longer improve the model, Yob will display the fitted model in the graph preview (here it is the solid blue line), and display the optimal parameter values in the Parameter Output section.
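The process above can be sketched with SciPy's curve_fit, which likewise refines an initial guess iteratively. The guesses below are read off the data — roughly what an automatic guesser might do — and the data itself is synthetic:

```python
import numpy as np
from scipy.optimize import curve_fit

def sinusoid(x, A, B, C, D):
    return A * np.sin(B * x + C) + D

# Synthetic data from A=2, B=1.5, C=0.3, D=5.
x = np.linspace(0, 10, 200)
y = 2.0 * np.sin(1.5 * x + 0.3) + 5.0

# Reasonable guess parameters read off the data:
#   A ~ half the peak-to-peak range, D ~ the mean,
#   B and C from an eyeballed period and phase.
p0 = [(y.max() - y.min()) / 2, 1.4, 0.0, y.mean()]

params, _ = curve_fit(sinusoid, x, y, p0=p0)
print(params)
```

Note that sinusoid parameters are only identifiable up to sign and phase shifts, so the fitted values may differ from the generating ones while tracing the same curve.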

Manual vs Automatic

By default, Curve Fits automatically generate guess parameters for you. However, if you wish to supply your own guess parameters, you can check the Manual Guess check box and edit the number fields below it:

Note

For all of the built-in models, you shouldn't have to worry about picking guess parameters, since Yob automatically picks good guesses for them. However, since picking guess parameters requires extra knowledge about the behavior of the model, Yob cannot automatically pick guess parameters for custom models. (At least, not good ones.)


Parameter Output

The fitted parameters of a Curve Fit can be found in the Parameter Output section. The example below shows the parameters for a quadratic model:

Along with the fitted parameters, various metrics for goodness of fit are presented, including the Root Mean Square Error (RMSE), the Coefficient of Determination (R^2), and the Correlation Coefficient (R).
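These metrics can be computed directly from the residuals. A sketch with NumPy, using the standard definitions and illustrative data:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.2, 4.9, 7.1, 9.0])

# Fit a linear model, then compute goodness-of-fit metrics.
A, B = np.polyfit(x, y, deg=1)
residuals = y - (A * x + B)

rmse = np.sqrt(np.mean(residuals ** 2))          # Root Mean Square Error
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot                # Coefficient of Determination
r = np.corrcoef(x, y)[0, 1]                      # Correlation Coefficient

print(rmse, r_squared, r)
```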