Main function for parameter estimation
estim_param(
  obs_list,
  crit_function = crit_log_cwss,
  model_function,
  model_options = NULL,
  optim_method = "nloptr.simplex",
  optim_options = NULL,
  param_info,
  forced_param_values = NULL,
  candidate_param = NULL,
  transform_var = NULL,
  transform_obs = NULL,
  transform_sim = NULL,
  satisfy_par_const = NULL,
  var = NULL,
  info_level = 1,
  info_crit_func = list(CroptimizR::BIC, CroptimizR::AICc, CroptimizR::AIC),
  weight = NULL,
  var_names = lifecycle::deprecated()
)
List of observed values to use for parameter estimation.
A named list (names = situation names) of data.frames, each containing one column named Date with the dates (Date or POSIXct format) of the observations, and one column per observed variable with either the measured value or NA if the variable is not observed at the given date.
See details section for more information on the list of observations actually used during the parameter estimation process.
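As an illustration of this structure, a minimal observation list could be built as follows. The situation names (sit1, sit2), the variable names (lai, biomass) and all values are hypothetical.

```r
# Hypothetical situations "sit1" and "sit2", hypothetical variables "lai" and "biomass".
obs_list <- list(
  sit1 = data.frame(
    Date = as.Date(c("2020-04-01", "2020-05-01")),
    lai = c(0.8, 2.1),
    biomass = c(NA, 3.4)  # biomass was not observed on the first date
  ),
  sit2 = data.frame(
    Date = as.Date("2021-04-15"),
    lai = 1.2
  )
)
```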
Function implementing the criterion to optimize (optional, see default value in the function signature). See here for more details about the list of proposed criteria.
Crop Model wrapper function to use.
List of options for the Crop Model wrapper (see help of the Crop Model wrapper function used).
Name of the parameter estimation method to use (optional, see default value in the function signature). For the moment, it can be "simplex" or "dreamzs". See here for a brief description and references on the available methods.
List of options of the parameter estimation method, containing:
out_dir: directory path where to write the optimization results (optional, defaults to getwd()),
ranseed: random seed, set so that each execution of estim_param gives the same results when using the same seed. If you want randomization, set it to NULL, otherwise set it to a number of your choice (e.g. 1234) (optional, defaults to NULL, which means a random seed),
specific options depending on the method used (see the examples given for the simplex and DreamZS methods),
path_results: path_results is no longer supported, use out_dir instead.
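A minimal sketch of such a list, using only the two generic options described above; the output folder is a hypothetical path, and method-specific options would be added to the same list.

```r
optim_options <- list(
  out_dir = file.path(tempdir(), "estim_results"),  # hypothetical output folder
  ranseed = 1234                                     # fixed seed for reproducible runs
)
```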
Information on the parameters to estimate. Either a list containing:
ub and lb: named vectors of upper and lower bounds (-Inf and Inf can be used if init_values is provided),
init_values: a data.frame containing initial values to test for the parameters (optional; if it is not provided, or if it contains fewer values than the number of repetitions of the minimization, all or part of the initial values are randomly generated using LHS sampling within the parameter bounds),
or a named list containing for each parameter:
sit_list: list of the groups of situations for which the estimated parameter must take different values (see here for an example),
ub and lb: vectors of upper and lower bounds (one value per group),
init_values: the list of initial values per group (data.frame, one column per group, optional).
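The two accepted forms might look like the sketch below. Parameter names (p1, p2), situation names, bounds, and the column names of the per-group init_values are all hypothetical.

```r
# Form 1: each parameter takes a single value for all situations.
param_info_simple <- list(
  lb = c(p1 = 0, p2 = 10),
  ub = c(p1 = 1, p2 = 50),
  init_values = data.frame(p1 = c(0.2, 0.7), p2 = c(20, 40))  # optional
)

# Form 2: p1 is common to all situations, p2 takes one value per group.
param_info_groups <- list(
  p1 = list(
    sit_list = list(c("sit1", "sit2", "sit3")),
    lb = 0, ub = 1
  ),
  p2 = list(
    sit_list = list(c("sit1", "sit2"), "sit3"),  # two groups of situations
    lb = c(10, 10), ub = c(50, 50),              # one bound per group
    init_values = data.frame(group1 = c(20, 40), group2 = c(25, 45))  # optional
  )
)
```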
Named vector or list containing the values (or arithmetic expressions, see details section) of the model parameters to force. The corresponding values will be passed to the model wrapper through its param_values argument during the estimation process.
It should not include values for estimated parameters (i.e. parameters defined in the param_info argument), except if they are listed as candidate parameters (see argument candidate_param).
Names of the parameters, among those defined in the argument param_info, that must only be considered as candidates for parameter estimation (see details section).
Named vector of functions to apply to both simulated and observed variables. For example, transform_var=c(var1=log, var2=sqrt) will apply a log-transformation to the simulated and observed values of variable var1, and a square-root transformation to the values of variable var2.
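The effect of such a vector can be pictured with base R. This sketch only illustrates the idea on a hypothetical data.frame of observed values; it is not CroptimizR's internal code.

```r
transform_var <- c(var1 = log, var2 = sqrt)
obs <- data.frame(var1 = c(1, 10), var2 = c(4, 9))  # hypothetical values

# Apply each function to the column of the same name.
for (v in names(transform_var)) {
  obs[[v]] <- transform_var[[v]](obs[[v]])
}
```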
User function for transforming observations before each criterion evaluation (optional), see details section for more information.
User function for transforming simulations before each criterion evaluation (optional), see details section for more information.
User function for including constraints on estimated parameters (optional), see details section for more information.
(optional) List of variables for which the model wrapper must return results. By default the wrapper is asked to simulate only the observed variables. However, it may be useful to simulate also other variables, typically when transform_sim and/or transform_obs functions are used. Note however that it is active only if the model_function used handles this argument. If it is the case, and if the var argument is provided, then the list of observations used will be restricted to the list of variables given in the var argument, plus the ones possibly computed by the transform_sim function.
(optional) Integer that controls the level of information returned and stored by estim_param (in addition to the results automatically provided, which depend on the method used). Higher values give more details.
0: add nothing,
1: add criterion and parameter values, and the constraint if satisfy_par_const is provided, for each evaluation (element params_and_crit in the returned list),
2: add model results, after transformation if transform_sim is provided, and after intersection with observations, i.e. as used to compute the criterion for each evaluation (element sim_intersect in the returned list),
3: add observations, after transformation if transform_obs is provided, and after intersection with simulations, i.e. as used to compute the criterion for each evaluation (element obs_intersect in the returned list),
4: add all model wrapper results for each evaluation, and all transformations if transform_sim is provided (elements sim and sim_transformed in the returned list).
Function (or list of functions) to compute information criteria (optional, see default value in the function signature and here for more details about the list of proposed information criteria). Values of the information criteria will be stored in the returned list. If parameter selection is activated (i.e. if the argument candidate_param is defined, see details section), the first information criterion given will be used. For the moment, this is only available for crit_function==crit_ols.
Weights to use in the criterion to optimize. A function that takes as input a vector of observed values and the name of the corresponding variable, and that must return either a single weight for the given variable or a vector of weights of the same length as the input vector of observed values.
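A sketch of a function with the expected signature is given below. The variable name "lai" and the numerical values are hypothetical and only serve to show the two allowed return shapes (a single value, or one value per observation).

```r
# Hypothetical weight function: a single weight for "lai",
# one weight per observed value for any other variable.
weight <- function(obs, var_name) {
  if (var_name == "lai") {
    return(0.5)
  }
  0.1 * abs(obs)
}
```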
Prints, graphs and a list containing the results of the parameter estimation, whose content depends on the method used and on the value of the info_level argument.
All results are saved in the folder optim_options$out_dir.
In CroptimizR, parameter estimation is based on the comparison between the values of the observed and simulated variables at corresponding dates. Only the situations, variables and dates common to both the observations (provided in the obs_list argument) and the simulations returned by the wrapper used will be taken into account in the parameter estimation process.
If the value of an observed variable is NA for a given situation and date, it will not be taken into account. If the value of a simulated variable is NA (or Inf) for a given situation and date for which there is an observation, the optimized criterion will take the NA value, which may stop the process, and the user will be warned.
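The intersection rule can be pictured with base R for a single situation. This is only a sketch of the idea, with hypothetical variables and values; it is not CroptimizR's internal code.

```r
obs <- data.frame(
  Date = as.Date(c("2020-04-01", "2020-05-01", "2020-06-01")),
  lai = c(0.8, NA, 3.0)
)
sim <- data.frame(
  Date = as.Date(c("2020-04-01", "2020-05-01")),
  lai = c(0.7, 2.0),
  biomass = c(1.0, 3.0)
)

# Keep only the variables present on both sides, then the common dates.
common_vars <- intersect(setdiff(names(obs), "Date"), setdiff(names(sim), "Date"))
both <- merge(obs[c("Date", common_vars)], sim[c("Date", common_vars)],
              by = "Date", suffixes = c("_obs", "_sim"))

# Observations equal to NA are not taken into account.
both <- both[!is.na(both$lai_obs), ]
```

Here only the pair observed and simulated on 2020-04-01 remains: biomass is never observed, 2020-06-01 is not simulated, and the observation of 2020-05-01 is NA.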
If the candidate_param argument is given, a parameter selection procedure following the AgMIP calibration phase III protocol will be performed: the candidate parameters are added one by one (in the given order) to the parameters that MUST be estimated (i.e. the ones defined in param_info but not in candidate_param). Each time a new candidate is added:
the parameter estimation is performed and an information criterion is computed (see argument info_crit_func),
if the information criterion is lower than all the ones obtained before, then the current candidate parameter is added to the list of parameters to estimate.
The result includes a summary of all the steps (data.frame param_selection_steps).
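The selection loop described above can be sketched as follows. The function fit_and_score is a hypothetical stand-in for "estimate the given parameters and return an information criterion", with made-up values. The sketch also assumes that a first estimation is run with the mandatory parameters only. It is not the package's implementation.

```r
# Hypothetical stand-in for one estimation followed by the computation
# of an information criterion (made-up values).
fit_and_score <- function(params) {
  scores <- c("p1" = 120, "p1+p2" = 110, "p1+p2+p3" = 115, "p1+p2+p4" = 105)
  scores[[paste(params, collapse = "+")]]
}

select_params <- function(mandatory, candidates, score_fun) {
  selected <- mandatory
  best <- score_fun(selected)  # assumed initial step: mandatory parameters only
  steps <- data.frame(tested = paste(selected, collapse = "+"),
                      crit = best, kept = TRUE)
  for (cand in candidates) {   # candidates are tried in the given order
    trial <- c(selected, cand)
    crit <- score_fun(trial)
    keep <- crit < best        # lower than all criteria obtained before
    steps <- rbind(steps, data.frame(tested = paste(trial, collapse = "+"),
                                     crit = crit, kept = keep))
    if (keep) {
      selected <- trial
      best <- crit
    }
  }
  list(selected = selected, steps = steps)
}

res <- select_params("p1", c("p2", "p3", "p4"), fit_and_score)
```

With these made-up scores, p2 and p4 are kept while p3 is rejected, and res$steps plays the role of a summary of the steps.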
The optional argument transform_obs must be a function with 4 arguments:
model_results: the list of simulated results returned by the model wrapper used
obs_list: the list of observations as given to the estim_param function
param_values: a named vector containing the current parameter values proposed by the estimation algorithm
model_options: the list of model options as given to the estim_param function
It must return a list of observations (same format as the obs_list argument) that will be used to compute the criterion to optimize.
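A minimal sketch of such a function is shown below. The variable names biomass and log_biomass are hypothetical. For the new variable to be used in the criterion, a simulated variable of the same name would also have to be available.

```r
# Hypothetical example: add a "log_biomass" variable to each observed
# situation that contains a "biomass" column.
transform_obs <- function(model_results, obs_list, param_values, model_options) {
  lapply(obs_list, function(df) {
    if ("biomass" %in% names(df)) {
      df$log_biomass <- log(df$biomass)
    }
    df
  })
}
```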
The optional argument transform_sim must be a function with 4 arguments:
model_results: the list of simulated results returned by the model wrapper used
obs_list: the list of observations as given to the estim_param function
param_values: a named vector containing the current parameter values proposed by the estimation algorithm
model_options: the list of model options as given to the estim_param function
It must return a list of simulated results (same format as that returned by the model wrapper used) that will be used to compute the criterion to optimize.
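A minimal sketch of such a function follows. It assumes that the wrapper stores the simulations in an element named sim_list (a named list of data.frames, one per situation); this format is an assumption here and should be checked against the wrapper actually used. The variable names are hypothetical.

```r
# Hypothetical example: compute a total "biomass" variable from two
# simulated components.
transform_sim <- function(model_results, obs_list, param_values, model_options) {
  # Assumption: simulations are stored in model_results$sim_list.
  model_results$sim_list <- lapply(model_results$sim_list, function(df) {
    if (all(c("biomass_leaves", "biomass_stems") %in% names(df))) {
      df$biomass <- df$biomass_leaves + df$biomass_stems
    }
    df
  })
  model_results
}
```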
The optional argument satisfy_par_const must be a function with 2 arguments:
param_values: a named vector containing the current parameter values proposed by the estimation algorithm
model_options: the list of model options as given to the estim_param function
It must return a logical indicating whether the parameter values satisfy the constraints (freely defined by the user in the function body) or not.
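For instance, an inequality constraint between two hypothetical parameters p1 and p2 could be written as:

```r
# Hypothetical constraint: p1 must be lower than p2.
satisfy_par_const <- function(param_values, model_options) {
  param_values[["p1"]] < param_values[["p2"]]
}
```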
The optional argument forced_param_values may contain arithmetic expressions to automatically compute the values of some parameters as a function of the values of parameters that are estimated (equality constraints). For that, forced_param_values must be a named list. Arithmetic expressions must be R expressions given as character strings. For example:
forced_param_values = list(p1=5, p2=7, p3="5*p5+p6")
will pass to the model wrapper the value 5 for parameter p1, 7 for parameter p2, and will dynamically compute the value of p3 as a function of the values of parameters p5 and p6 iteratively provided by the parameter estimation algorithm. In this example, the parameters p5 and p6 must thus be part of the list of parameters to estimate, i.e. described in the param_info argument.
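How such an expression resolves for one set of estimated values can be illustrated with base R. This sketch only illustrates the principle and is not necessarily how CroptimizR evaluates the expressions; the values of p5 and p6 are hypothetical.

```r
forced_param_values <- list(p1 = 5, p2 = 7, p3 = "5*p5+p6")
estimated <- c(p5 = 2, p6 = 1)  # hypothetical values proposed by the algorithm

# Evaluate character expressions using the estimated values; keep numbers as is.
resolved <- lapply(forced_param_values, function(v) {
  if (is.character(v)) eval(parse(text = v), envir = as.list(estimated)) else v
})
```

With these values, p3 resolves to 5 * 2 + 1 = 11, while p1 and p2 keep their fixed values.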
For more details and examples, see the different vignettes on the CroptimizR website.