PRO fit_linear_1dvec, fx, x0, vdata, weights, coeffs, coeffs_err, model, chisq, status, danidl_cppcode, ERRBAR = errbar, ITERATION_PAR = iteration_par, $
; Description: This module fits a two-parameter linear model (or line) to a one-dimensional pixel vector of
; input data "vdata". The model used is:
;
; y = m*fx*(x - x0) + c
;
; where x represents the pixel coordinates, fx is the x coordinate scale factor, x0 is the x
; coordinate offset, y represents the pixel values with data "vdata", and m and c are the
; coefficients of the linear model. The values of fx and x0 are specified by the input parameters
; "fx" and "x0" respectively. If required, the fit can be weighted via the parameter "weights".
; The fitted linear coefficients are returned via the parameter "coeffs" along with the status
; of the fit "status".
; In the case that the weights of the fit "weights" represent the inverse variance weights
; for the vector data "vdata", then the uncertainties on the fitted parameters are represented
; by "coeffs_err", and the chi squared of the fit is represented by "chisq".
; The module also provides the option of treating the weights as representing the 1-sigma
; uncertainties on the vector data "vdata", in which case the actual weights that are used in
; the fit are the inverse variance weights, and again the uncertainties on the fitted parameters
; are represented by "coeffs_err", and the chi squared of the fit is represented by "chisq".
; For further flexibility, the module provides the option of iterating the fitting process. By
; setting the keyword ITERATION_PAR to an IDL structure that contains the six tags "method",
; "sigma", "maxrej", "permanent", "tol" and "maxiter" defined below (see under "Keywords"), the
; module will do the following.
; If "sigma" is positive, then the sigma-clip threshold S is taken to be the value of "sigma".
; If "sigma" is negative or zero, then the sigma-clip threshold S is calculated as sqrt(ln(Ngood)),
; where Ngood is the number of data values in "vdata" with positive weights. This expression for
; the sigma-clip threshold S may be derived from considerations of the Bayesian information
; criterion (BIC).
; If the tag "method" is set to "abs_resid", then, during each iteration of the fitting process,
; all data values with absolute residuals greater than S times the standard deviation of the
; residuals are flagged as candidate outliers. If the tag "method" is set to "norm_abs_resid",
; then, during each iteration of the fitting process, all data values with normalised absolute
; residuals greater than S, where the normalising factors are the square roots of the corresponding
; weights, are flagged as candidate outliers.
; If the tag "maxrej" is set to an INTEGER/LONG type positive number, then this number is
; considered to be the maximum number of data values that can be rejected in any one iteration,
; and consequently, only this number of candidate outliers with the largest residuals are
; rejected. If the tag "maxrej" is set to a FLOAT/DOUBLE type positive number that is less than
; or equal to "1.0", then this number is considered to be the maximum fraction of data values
; that can be rejected in any one iteration, and consequently, only the corresponding number of
; candidate outliers with the largest residuals are rejected. If the tag "maxrej" is set to
; anything else, then there is no limit on the number of data values that can be rejected in any
; one iteration.
; If the tag "permanent" is set to "1", then data values that are rejected will remain
; rejected for all subsequent iterations, meaning that the set of data values used at each
; iteration will only diminish. If the tag "permanent" is set to any other value, then data
; values will not be permanently rejected so that if a data value is initially rejected, then it
; may be subsequently included in the fitting process during the later iterations. This protects
; against non-outlier data values being permanently excluded from the calculation of the fit
; parameters when real outlier data values skew the parameter estimates during the early
; iterations.
; The iterations stop when the maximum fractional change in the fit parameters is less than or
; equal to "tol", or when there are not enough good data values to be used in the fit. To protect
; against the possibility of the fitting process not converging (or entering into an oscillatory
; regime), a limit of "maxiter" iterations is placed on the number of iterations that are
; performed.
; Useful extra information about the fitting process is returned via the keyword EXTRA_INFO,
; and the subscripts of the data values that were used in the (final iteration of the) fit are
; returned via the keyword DATA_USED_SUBS.
; This module will use C++ code if possible to achieve a faster execution than the default IDL
; code. The C++ code can be up to a factor of ~1.04 times faster than the IDL code in this case.
;
; Input Parameters:
;
; fx - FLOAT/DOUBLE - The x coordinate scale factor as defined above. This parameter must be non-zero.
; x0 - FLOAT/DOUBLE - The x coordinate offset (pix) as defined above.
; vdata - FLOAT/DOUBLE VECTOR - A one-dimensional vector of pixel values to which the linear model is to be
; fitted.
; weights - FLOAT/DOUBLE VECTOR - A one-dimensional vector of the same length as "vdata" which contains the
; weights to be used in the calculation of the fitted linear model. If this
; parameter does not have the correct number type or the same length as
; "vdata", then all pixel values are weighted equally. Pixel values to be
; ignored can be pre-flagged by setting the corresponding weights in this
; vector to zero (or negative values). If the keyword ERRBAR is set, then
; the module will assume that the values in this parameter represent the
; 1-sigma uncertainties on the pixel values, and it will calculate the actual
; weights to be used in the fit as the inverse variance weights.
; danidl_cppcode - STRING - The full directory path indicating where the DanIDL C++ code is installed. The
; shared library of DanIDL C++ routines should exist as
; "/dist/libDanIDL.so" within the installation.
;
; Output Parameters:
;
; coeffs - DOUBLE VECTOR - A two-element one-dimensional vector that contains the fitted coefficients m
; and c.
; coeffs_err - DOUBLE VECTOR - A one-dimensional vector of the same length as "coeffs" that contains the
; uncertainties on the fitted coefficients m and c. The elements of this
; output parameter are only meaningful if the input weights "weights" represent
; the inverse variances for the vector data "vdata", or if the keyword
; ERRBAR is set (see below) and the weights represent the 1-sigma uncertainties
; on the vector data "vdata".
; model - DOUBLE VECTOR - A one-dimensional vector of the same length as "vdata" representing the fitted model
; pixel values for the vector data "vdata".
; chisq - DOUBLE - The chi squared of the fit ignoring bad data values flagged in "weights" and rejected data
; values identified during the iterative fitting process. This quantity only represents the
; chi squared in the case where the input weights "weights" represent the inverse variances
; for the vector data "vdata", or if the keyword ERRBAR is set (see below) and the weights
; represent the 1-sigma uncertainties on the vector data "vdata".
; status - INTEGER - If the fit parameters were calculated successfully and no iterations were performed, then
; "status" is returned with a value of "4". If the fit parameters were calculated
; successfully and the iterations were stopped because the algorithm tolerance was achieved,
; then "status" is returned with a value of "3". If the fit parameters were calculated
; successfully but the iterations were stopped because the maximum number of iterations
; "maxiter" had been performed, then "status" is returned with a value of "2". If the fit
; parameters were calculated successfully but the iterations were stopped because there
; were not enough good data values to be used in the next iteration, then "status" is
; returned with a value of "1". If the fit parameters were not calculated successfully,
; then "status" is returned with a value of "0".
;
; Keywords:
;
; If the keyword ERRBAR is set (as "/ERRBAR"), then the module will assume that the weights "weights" represent
; the 1-sigma uncertainties on the pixel values "vdata", and it will calculate the actual weights to be used in
; the fit as the inverse variance weights.
;
; If the keyword ITERATION_PAR is set to an IDL structure with the six tags described below, then the module
; will iterate the fitting process (as described under "Description"). The tags that must be present are:
;
; (i) method - STRING - This tag has two acceptable values. If "method" is set to "abs_resid", then, during
; each iteration of the fitting process, all data values with absolute residuals
; greater than S times (see "sigma") the standard deviation of the residuals are
; flagged as candidate outliers. If "method" is set to "norm_abs_resid", then, during
; each iteration of the fitting process, all data values with normalised absolute
; residuals greater than S, where the normalising factors are the square roots of the
; corresponding weights, are flagged as candidate outliers.
;
; (ii) sigma - FLOAT/DOUBLE - This tag specifies the sigma-clip rejection threshold to be used. If "sigma"
; is positive, then the sigma-clip threshold S is taken to be the value of
; "sigma". If "sigma" is negative or zero, then the sigma-clip threshold S is
; calculated as sqrt(ln(Ngood)), where Ngood is the number of data values in
; "vdata" with positive weights. This expression for the sigma-clip threshold
; S may be derived from considerations of the Bayesian information criterion
; (BIC).
;
; (iii) maxrej - INTEGER/LONG or FLOAT/DOUBLE - If this tag is set to an INTEGER/LONG type positive number,
; then the value of this tag is considered to be the maximum
; number of data values that can be rejected in any one
; iteration. If this tag is set to a FLOAT/DOUBLE type
; positive number that is less than or equal to "1.0", then
; the value of this tag is considered to be the maximum
; fraction of data values that can be rejected in any one
; iteration. If this tag is set to anything else, then there
; is no limit on the number of data values that can be
; rejected in any one iteration.
;
; (iv) permanent - INTEGER/LONG - If this tag is set to "1", then data values that are rejected will remain
; rejected for all subsequent iterations, meaning that the set of data
; values used at each iteration will only diminish. If this tag is set to
; any other value, then data values will not be permanently rejected so that
; if a data value is initially excluded then it may be subsequently included
; in the fitting process during the later iterations.
;
; (v) tol - FLOAT/DOUBLE - This tag specifies the tolerance of the iterative fitting process. The iterations
; will stop if the maximum fractional change in the fit parameters between subsequent
; iterations is less than or equal to "tol". This parameter must have a value
; greater than or equal to "0.0", and less than "1.0". A value for "tol" of "0.0"
; means that the iterations will continue until none of the fit parameters change
; from one iteration to the next.
;
; (vi) maxiter - INTEGER/LONG - This tag specifies the maximum number of iterations to be performed before
; returning the fit parameters. If this parameter is set to a negative value
; or zero, then there will be no constraint on the maximum number of
; iterations that may be performed.
;
; If the keyword EXTRA_INFO is set to a named variable, then, on return, this variable will contain an IDL
; structure with the following tags:
;
; (i) ndata - LONG - The number of data values in the input parameter "vdata".
; (ii) ndata_good - LONG - The initial number of data values with positive weights.
; (iii) ndata_used - LONG - The number of data values used in the calculation of the final fit parameters.
; (iv) niter - LONG - The number of iterations performed in the fitting process.
;
; If the keyword DATA_USED_SUBS is set to a named variable, then, on return, this variable will contain a
; VECTOR of LONG type numbers representing the subscripts for the data values that were used in the
; calculation of the final fit parameters.
;
; N.B: (i) In DanIDL, I adopt a pixel coordinate system such that the origin of the coordinate system is at
; the left hand side of the left hand pixel in a vector. Therefore, the centre of the left hand pixel
; in a vector has an x coordinate of 0.5 in this coordinate system. For each full pixel moved to the
; right in the vector, the x pixel coordinate increments by 1.
;
; (ii) For easier code maintainability with respect to the iterative schema, I employ the DanIDL module
; "scale_pattern.pro" to execute the fitting process, although this comes at some cost to efficiency
; and speed given the relatively trivial nature of the least squares problem.
;
; Author: Dan Bramich (dan.bramich@hotmail.co.uk)
;
; History:
;
; 08/05/2010 - Module created (dmb).
;Set the default output parameter values
;Check that "vdata" is a one-dimensional vector of the correct number type
;Calculate the vector pixel coordinates
;Prepare the required pattern array
;If "weights" is a one-dimensional vector of the same length as "vdata"
;Fit the linear model to the input data using the weights "weights" and return
;If "weights" is not a one-dimensional vector of the same length as "vdata", then fit the linear model to the input data
;with equal weights