Training sets¶
ecopann.cosmic_params¶
- class ecopann.cosmic_params.ParamsProperty(param_names, params_dict=None)[source]¶
Bases: object
- property labels¶
- property param_fullNames¶
- property params_limit¶
- ecopann.cosmic_params.params_dict_zoo()[source]¶
Information about the cosmological parameters, including their labels and physical limits: [label, limit_min, limit_max]
The label is used when plotting figures. The physical limits ensure that the simulated parameters are physically meaningful.
Note
If a physical limit of a parameter is unknown or does not exist, it should be set to np.nan.
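As an illustration of the [label, limit_min, limit_max] format, a minimal sketch of such a dictionary follows. The parameter names, labels, and limit values are hypothetical placeholders, not the entries actually returned by params_dict_zoo(), and get_limits is an illustrative helper rather than part of ecopann.

```python
import numpy as np

# Hypothetical entries in the [label, limit_min, limit_max] format.
# Names and values are illustrative only, not those shipped with ecopann.
params_dict = {
    "H0": [r"$H_0$", 0.0, np.nan],            # no known upper limit -> np.nan
    "omm": [r"$\Omega_{\rm m}$", 0.0, 1.0],   # both limits known
    "w": [r"$w$", np.nan, np.nan],            # no physical limits at all
}

def get_limits(param_names, params_dict):
    """Collect [limit_min, limit_max] of the requested parameters as an (n, 2) array."""
    return np.array([params_dict[name][1:] for name in param_names], dtype=float)

limits = get_limits(["H0", "omm"], params_dict)
```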
ecopann.data_simulator¶
- class ecopann.data_simulator.AddGaussianNoise(spectra, params=None, obs_errors=None, cholesky_factor=None, noise_type='multiNormal', factor_sigma=0.5, multi_noise=5, use_GPU=True)[source]¶
Bases: object
Add Gaussian noise for simulated data.
- Parameters
spectra (torch tensor, or a list of torch tensors) – The simulated spectra (data) with shape (N, spectra_length), or a list of spectra with shapes [(N, spectra_length_1), (N, spectra_length_2), …]
params (torch tensor or None) – The simulated cosmological parameters. Default: None
obs_errors (torch tensor, or a list of torch tensors, optional) – Observational errors (standard deviations) with shape (spectra_length,), or a list of errors with shapes [(spectra_length_1,), (spectra_length_2,), …]. Default: None
cholesky_factor (torch tensor, a list of torch tensors, or None, optional) – Cholesky factor of the covariance matrix with shape (spectra_length, spectra_length), or a list of Cholesky factors of covariance matrices with shapes [(spectra_length_1, spectra_length_1), (spectra_length_2, spectra_length_2), …]. Default: None
noise_type (str, optional) – The type of Gaussian noise added to the training set, ‘singleNormal’ or ‘multiNormal’. Default: ‘multiNormal’
factor_sigma (float, optional) – For ‘singleNormal’, the factor multiplying the observational error (standard deviation); for ‘multiNormal’, the standard deviation of the coefficient of the observational error (standard deviation). Default: 0.5
multi_noise (int, optional) – The number of noise realizations added to a spectrum. Default: 5
use_GPU (bool, optional) – If True, the noise is generated on the GPU; otherwise, it is generated on the CPU. Default: True
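The two noise types can be sketched roughly as follows. This is one possible reading of the parameter descriptions above, written with NumPy on the CPU rather than torch; the function add_gaussian_noise and its seed argument are hypothetical and are not the class's actual implementation.

```python
import numpy as np

def add_gaussian_noise(spectra, obs_errors, noise_type="multiNormal",
                       factor_sigma=0.5, multi_noise=5, seed=None):
    """Return noisy copies of `spectra` with shape (N * multi_noise, spectra_length)."""
    rng = np.random.default_rng(seed)
    # Repeat every spectrum so that each copy receives its own noise realization.
    repeated = np.repeat(spectra, multi_noise, axis=0)
    if noise_type == "singleNormal":
        # Noise standard deviation is a fixed fraction of the observational error.
        sigma = factor_sigma * obs_errors
    elif noise_type == "multiNormal":
        # Assumed reading: one random coefficient per noisy copy,
        # drawn from N(0, factor_sigma^2) and taken in absolute value.
        coeff = np.abs(rng.normal(0.0, factor_sigma, size=(repeated.shape[0], 1)))
        sigma = coeff * obs_errors
    else:
        raise ValueError("noise_type must be 'singleNormal' or 'multiNormal'")
    return repeated + rng.normal(size=repeated.shape) * sigma

spectra = np.ones((3, 4))          # N=3 spectra of length 4
obs_errors = np.full(4, 0.1)       # one standard deviation per data point
noisy = add_gaussian_noise(spectra, obs_errors, seed=0)
```

With multi_noise=5, the three input spectra yield fifteen noisy spectra.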
- class ecopann.data_simulator.CutParams(param_names, params_dict=None)[source]¶
Bases: object
Cut parameter samples that crossed the parameter limits.
- Parameters
param_names (list) – A list that contains parameter names.
params_dict (dict or None, optional) – Information about the cosmological parameters, including the labels, the minimum values, and the maximum values. See params_dict_zoo(). Default: None
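The cut can be pictured as a row-wise mask over the samples. The sketch below is an assumption about the behaviour, not the class's code: cut_crossed_limits is a hypothetical helper, and np.nan limits are treated as "no constraint", following the note in params_dict_zoo().

```python
import numpy as np

def cut_crossed_limits(params, limits):
    """Keep only rows of `params` (N, n) that lie inside `limits` (n, 2)."""
    lower, upper = limits[:, 0], limits[:, 1]
    # A nan limit imposes no constraint on that side.
    above_lower = np.isnan(lower) | (params >= lower)
    below_upper = np.isnan(upper) | (params <= upper)
    keep = np.all(above_lower & below_upper, axis=1)
    return params[keep]

limits = np.array([[0.0, np.nan],    # lower limit only
                   [0.0, 1.0]])      # both limits
params = np.array([[70.0, 0.3],      # inside
                   [-1.0, 0.3],      # first parameter below its lower limit
                   [65.0, 1.2]])     # second parameter above its upper limit
kept = cut_crossed_limits(params, limits)
```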
- class ecopann.data_simulator.ParametersFilter(param_names, sim_params, params_space, prev_space, check_include=True, rel_dev_limit=0.2)[source]¶
Bases: object
Select cosmological parameters from a data set according to a given parameter space.
- Parameters
param_names (list) – A list that contains parameter names.
sim_params (array-like) – The simulated cosmological parameters with the shape of (N, n), where N is the number of samples and n is the number of parameters.
params_space (array-like) – The parameter space with the shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit].
prev_space (array-like) – The parameter space of local simulated data (or mock data in previous step), with shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit].
check_include (bool, optional) – If True, it will check whether params_space is in the space of sim_params, otherwise, do nothing. Default: True
rel_dev_limit (float, optional) – The limit of the relative deviation when params_space is not in the space of sim_params. The default is 20% (this means that if params_space is \([-5\sigma, +5\sigma]\), it can deviate \(<1\sigma\) from sim_params). Note that it should be \(<0.4\) (a deviation \(<2\sigma\) for the parameter space \([-5\sigma, +5\sigma]\)). Default: 0.2
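A rough sketch of the selection and of the inclusion check is given below. Both helpers are hypothetical. In particular, the relative deviation is assumed here to be the overshoot of params_space beyond prev_space, measured in units of the half-width of params_space, which reproduces the "20% corresponds to \(1\sigma\) for \([-5\sigma, +5\sigma]\)" statement but may differ from the actual definition.

```python
import numpy as np

def filter_params(sim_params, params_space):
    """Return rows of sim_params (N, n) inside params_space (n, 2), plus their indices."""
    lower, upper = params_space[:, 0], params_space[:, 1]
    inside = np.all((sim_params >= lower) & (sim_params <= upper), axis=1)
    return sim_params[inside], np.where(inside)[0]

def max_relative_deviation(params_space, prev_space):
    """Largest overshoot of params_space beyond prev_space (assumed definition)."""
    half_width = (params_space[:, 1] - params_space[:, 0]) / 2.0
    low_over = np.clip(prev_space[:, 0] - params_space[:, 0], 0.0, None)
    high_over = np.clip(params_space[:, 1] - prev_space[:, 1], 0.0, None)
    return np.max(np.maximum(low_over, high_over) / half_width)

params_space = np.array([[-5.0, 5.0]])    # [-5 sigma, +5 sigma] with sigma = 1
prev_space = np.array([[-4.5, 10.0]])     # previous space misses 0.5 sigma on the low side
sim_params = np.array([[0.0], [6.0], [-4.0]])

selected, index = filter_params(sim_params, params_space)
deviation = max_relative_deviation(params_space, prev_space)
usable = deviation < 0.2                  # compare against rel_dev_limit
```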
- class ecopann.data_simulator.SimMultiSpectra(branch_n, N, model, param_names, chain=None, params_space=None, spaceSigma=5, params_dict=None, space_type='hypercube', cut_crossedLimit=True, cut_crossedBest=True, cross_best=False, local_samples=None, prevStep_data=None, check_include=True, rel_dev_limit=0.2)[source]¶
Bases: SimSpectra
Simulate training set containing multiple observations (for multi-branch network).
- Parameters
branch_n (int) – The number of branches of the network.
N (int) – The number of data to be simulated.
model (cosmological (or theoretical) model instance) – A cosmological (or theoretical) model instance that is used to simulate the training set. It should contain a ‘simulate’ method, and ‘simulate’ should accept cosmological parameters as input. If you use local data sets, it should also contain ‘load_params’, ‘load_params_space’, and ‘load_sample’ methods.
param_names (list) – A list that contains parameter names.
chain (array-like or None) – The predicted ANN chain in the previous step. If chain is an array, params_space will be ignored. If chain is None, params_space should be given. Default: None
params_space (array-like) – The parameter space with the shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit].
spaceSigma (int or array-like, optional) – The size of the parameter space to be learned. It is an int or a numpy array with shape (n,), where n is the number of parameters, e.g. for spaceSigma=5, the parameter space to be learned is \([-5\sigma, +5\sigma]\). Default: 5
params_dict (dict or None, optional) – Information about the cosmological parameters, including the labels, the minimum values, and the maximum values. See params_dict_zoo(). Default: None
space_type (str, optional) – The type of parameter space. It can be ‘hypercube’, ‘LHS’, ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’. Default: ‘hypercube’
cut_crossedLimit (bool, optional) – If True, the data points that cross the parameter limits will be cut. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’. Default: True
cut_crossedBest (bool, optional) – If True, the folded data points that cross the best values will be cut. It is recommended to set it to True. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’, and when cut_crossedLimit=False. Default: True
cross_best (bool, optional) – If True, the folded data points will cross the best values, otherwise, the folded data points will not cross the best values. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’, and when cut_crossedLimit=False and cut_crossedBest=False. Default: False
local_samples (None, str, or list, optional) – Path of local samples, None, ‘sample’ or [‘sample’] or [‘sample_1’, ‘sample_2’, …]. If None, no local samples are used. Default: None
prevStep_data (None or list, optional) – Samples simulated in the previous step. If a list, it should be [spectra, params]. The spectra or params have shape (N, n), where N is the number of spectra and n is the number of data points in a spectrum. Default: None
check_include (bool, optional) – If True, will check whether params_space is in the space of local_samples, otherwise, do nothing. Default: True
rel_dev_limit (float, optional) – The limit of the relative deviation when params_space is not in the space of sim_params. The default is 20% (this means that if params_space is \([-5\sigma, +5\sigma]\), it can deviate \(<1\sigma\) from sim_params). Note that it should be \(<0.4\) (a deviation \(<2\sigma\) for the parameter space \([-5\sigma, +5\sigma]\)). Default: 0.2
- Variables
prev_space (array-like) – The parameter space of local simulated data (or mock data in previous step), with shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit].
seed (None or int, optional) – Seed number which controls random draws. Default: None
Note
Either chain or params_space should be given to simulate samples.
- class ecopann.data_simulator.SimParameters(param_names, chain=None, params_space=None, spaceSigma=5, params_dict=None, space_type='hypercube', cut_crossedLimit=True, cut_crossedBest=True, cross_best=False)[source]¶
Bases: CutParams
Simulate parameters.
- Parameters
param_names (list) – A list that contains parameter names.
chain (array-like or None) – The predicted ANN chain in the previous step. If chain is an array, params_space will be ignored. If chain is None, params_space should be given. Default: None
params_space (array-like or None) – The parameter space with the shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit]. This is only used for space_type=’hypercube’ and space_type=’LHS’. If chain is an array, params_space will be ignored. If chain is None, params_space should be given. Default: None
spaceSigma (int or array-like, optional) – The size of the parameter space to be learned. It is an int or a numpy array with shape (n,), where n is the number of parameters, e.g. for spaceSigma=5, the parameter space to be learned is \([-5\sigma, +5\sigma]\). Default: 5
params_dict (dict or None, optional) – Information about the cosmological parameters, including the labels, the minimum values, and the maximum values. See params_dict_zoo(). Default: None
space_type (str, optional) – The type of parameter space. It can be ‘hypercube’, ‘LHS’, ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’. Default: ‘hypercube’
cut_crossedLimit (bool, optional) – If True, the data points that cross the parameter limits will be cut. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’. Default: True
cut_crossedBest (bool, optional) – If True, the folded data points that cross the best values will be cut. It is recommended to set it to True. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’, and when cut_crossedLimit=False. Default: True
cross_best (bool, optional) – If True, the folded data points will cross the best values, otherwise, the folded data points will not cross the best values. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’, and when cut_crossedLimit=False and cut_crossedBest=False. Default: False
- Variables
seed (None or int, optional) – Seed number which controls random draws. Default: None
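When chain is given, the parameter space follows from the chain and spaceSigma. The sketch below shows one way this could work, assuming the mean and standard deviation of the chain define the best values and \(\sigma\); the actual estimator used by the class is not stated here, and params_space_from_chain is a hypothetical helper.

```python
import numpy as np

def params_space_from_chain(chain, space_sigma=5, limits=None):
    """Build an (n, 2) space [best - k*sigma, best + k*sigma] from a chain of shape (N, n)."""
    best = np.mean(chain, axis=0)     # assumption: the mean is used as the best value
    sigma = np.std(chain, axis=0)     # assumption: symmetric 1-sigma error
    k = np.asarray(space_sigma) * np.ones(chain.shape[1])   # int or per-parameter array
    space = np.column_stack([best - k * sigma, best + k * sigma])
    if limits is not None:
        # Clip to the physical limits; nan entries mean "no limit".
        lower = np.where(np.isnan(limits[:, 0]), -np.inf, limits[:, 0])
        upper = np.where(np.isnan(limits[:, 1]), np.inf, limits[:, 1])
        space[:, 0] = np.maximum(space[:, 0], lower)
        space[:, 1] = np.minimum(space[:, 1], upper)
    return space

chain = np.array([[1.0, 0.0],
                  [3.0, 2.0]])                     # mean [2, 1], std [1, 1]
limits = np.array([[0.0, np.nan],
                   [np.nan, np.nan]])
space = params_space_from_chain(chain, space_sigma=5, limits=limits)
```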
Note
Either chain or params_space should be given to simulate samples.
- property combinations¶
- hypercube(N)[source]¶
Generate samples uniformly in a hypercube parameter space using uniform distribution.
- Parameters
N (int) – The number of data to be simulated.
- Returns
Parameters.
- Return type
array-like
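Uniform sampling in a hypercube amounts to drawing each parameter independently between its limits. A minimal sketch, where the standalone function sample_hypercube and its params_space and seed arguments are illustrative assumptions (the method itself only takes N):

```python
import numpy as np

def sample_hypercube(N, params_space, seed=None):
    """Draw N points uniformly inside params_space of shape (n, 2)."""
    rng = np.random.default_rng(seed)
    lower, upper = params_space[:, 0], params_space[:, 1]
    # Each column is drawn independently between its own limits.
    return rng.uniform(lower, upper, size=(N, len(lower)))

params_space = np.array([[60.0, 80.0],
                         [0.1, 0.5]])
samples = sample_hypercube(1000, params_space, seed=0)
```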
- hyperellipsoid(N)[source]¶
Generate samples uniformly in a hyperellipsoid parameter space using covariance between parameters.
https://scipy-cookbook.readthedocs.io/items/CorrelatedRandomSamples.html https://blogs.sas.com/content/iml/2012/02/08/use-the-cholesky-transformation-to-correlate-and-uncorrelate-variables.html
- Parameters
N (int) – The number of data to be simulated.
- Returns
Parameters.
- Return type
array-like
Note
For Cholesky decomposition, the covariance matrix \(C = LL^T\). So, the transformation relationship between correlated parameters \(P_{corr}\) and uncorrelated parameters \(P_{uncorr}\) is \(P_{corr} = LP_{uncorr}\), \(P_{uncorr} = L^{-1}P_{corr}\)
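The transformation in the note can be checked numerically. In the sketch below the samples are stored as rows, so \(P_{corr} = LP_{uncorr}\) becomes a right-multiplication by \(L^T\); the function names and the example covariance matrix are illustrative assumptions.

```python
import numpy as np

def correlate(uncorr, cov):
    """Map uncorrelated samples (N, n) to correlated ones: P_corr = L P_uncorr."""
    L = np.linalg.cholesky(cov)        # lower triangular, cov = L @ L.T
    return uncorr @ L.T                # row-vector form of L @ p for every sample p

def uncorrelate(corr, cov):
    """Inverse map: P_uncorr = L^{-1} P_corr."""
    L = np.linalg.cholesky(cov)
    return np.linalg.solve(L, corr.T).T

cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])
rng = np.random.default_rng(0)
uncorr = rng.normal(size=(50000, 2))   # unit-variance, uncorrelated samples
corr = correlate(uncorr, cov)
recovered = uncorrelate(corr, cov)
```

The sample covariance of corr approaches cov as the number of samples grows, and uncorrelate recovers the original samples.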
- hypersphere(N)[source]¶
Generate samples uniformly in a hypersphere parameter space.
- Parameters
N (int) – The number of data to be simulated.
- Returns
Parameters.
- Return type
array-like
- in_polygon(edge, x, y, get_points=True)[source]¶
Judge whether the given points are in the area surrounded by the polygon.
- Parameters
edge (array-like) – 2-D array with shape (N, 2). The vertices of a polygon.
x (array-like) – 1-D array with shape (M,). The x coordinate of the data points.
y (array-like) – 1-D array with shape (M,). The y coordinate of the data points.
get_points (bool, optional) – If True, it will return data points inside the area, if False, it will return a bool array which is True if the (closed) path contains the corresponding point. Default: True
- Returns
Points in the polygon.
- Return type
array-like
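A point-in-polygon test can be sketched with the ray-casting (even-odd) rule. This is a standalone illustration of the technique, not necessarily how the method is implemented; returning the inside points as a (K, 2) array is an assumption.

```python
import numpy as np

def in_polygon(edge, x, y, get_points=True):
    """Even-odd ray casting: which points (x, y) lie inside the polygon `edge`?"""
    edge = np.asarray(edge, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    n = len(edge)
    for i in range(n):
        x1, y1 = edge[i]
        x2, y2 = edge[(i + 1) % n]     # the last vertex connects back to the first
        if y1 == y2:
            continue                   # a horizontal side cannot cross a horizontal ray
        # The side straddles the horizontal line through the point ...
        straddles = (y1 > y) != (y2 > y)
        # ... and the crossing lies to the right of the point.
        x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
        inside ^= straddles & (x < x_cross)
    if get_points:
        return np.column_stack([x[inside], y[inside]])
    return inside

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
x = np.array([0.5, 1.5, -0.1])
y = np.array([0.5, 0.5, 0.2])
mask = in_polygon(square, x, y, get_points=False)
points = in_polygon(square, x, y)
```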
- lhs(N)[source]¶
Generate samples uniformly in a hypercube parameter space using Latin hypercube sampling.
https://en.wikipedia.org/wiki/Latin_hypercube_sampling https://blog.csdn.net/yuxeaotao/article/details/108952326
- Parameters
N (int) – The number of data to be simulated.
- Returns
Parameters.
- Return type
array-like
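Latin hypercube sampling splits every parameter range into N equal strata and places exactly one sample in each stratum. A minimal sketch, where the standalone function lhs_sketch and its params_space and seed arguments are illustrative assumptions:

```python
import numpy as np

def lhs_sketch(N, params_space, seed=None):
    """Latin hypercube sample of N points inside params_space of shape (n, 2)."""
    rng = np.random.default_rng(seed)
    n = params_space.shape[0]
    # One point per stratum: stratum index plus a uniform offset inside the stratum.
    unit = (np.arange(N)[:, None] + rng.random((N, n))) / N
    # Shuffle the strata independently for every parameter.
    for j in range(n):
        unit[:, j] = rng.permutation(unit[:, j])
    lower, upper = params_space[:, 0], params_space[:, 1]
    return lower + unit * (upper - lower)

params_space = np.array([[0.0, 1.0],
                         [-2.0, 2.0]])
samples = lhs_sketch(10, params_space, seed=0)
```

Compared with plain uniform sampling, every one-dimensional projection of the result covers its range evenly.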
- property params_n¶
- random_ball(N, dimension, radius=1)[source]¶
Generate N samples uniformly in a ball of the given dimension (hypersphere).
https://www.cnpython.com/qa/349434 https://www.zhihu.com/question/277712372 https://blogs.sas.com/content/iml/2016/04/06/generate-points-uniformly-in-ball.html https://arxiv.org/pdf/1404.1347.pdf https://www.sciencedirect.com/science/article/pii/S0047259X10001211
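One standard way to sample a ball uniformly is to normalize Gaussian vectors to get random directions, then scale each by a radius drawn as \(r = R\,u^{1/d}\) with \(u\) uniform in [0, 1). The sketch below follows that approach; it is an illustration of the technique and the seed argument is an assumption.

```python
import numpy as np

def random_ball(N, dimension, radius=1, seed=None):
    """Draw N points uniformly inside a ball of the given dimension and radius."""
    rng = np.random.default_rng(seed)
    # Normalized Gaussian vectors are uniformly distributed over the unit sphere.
    directions = rng.normal(size=(N, dimension))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # The volume inside radius r grows as r**dimension, hence the 1/dimension power.
    radii = radius * rng.random(N) ** (1.0 / dimension)
    return directions * radii[:, None]

points = random_ball(20000, 3, radius=2.0, seed=0)
norms = np.linalg.norm(points, axis=1)
```

For a uniform distribution in three dimensions, about one eighth of the points fall within half the radius.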
- class ecopann.data_simulator.SimSpectra(N, model, param_names, chain=None, params_space=None, spaceSigma=5, params_dict=None, space_type='hypercube', cut_crossedLimit=True, cut_crossedBest=True, cross_best=False, local_samples=None, prevStep_data=None, check_include=True, rel_dev_limit=0.2)[source]¶
Bases: SimParameters
Simulate training set.
- Parameters
N (int) – The number of data to be simulated.
model (cosmological (or theoretical) model instance) – A cosmological (or theoretical) model instance that is used to simulate the training set. It should contain a ‘simulate’ method, and ‘simulate’ should accept cosmological parameters as input. If you use local data sets, it should also contain ‘load_params’, ‘load_params_space’, and ‘load_sample’ methods.
param_names (list) – A list that contains parameter names.
chain (array-like or None) – The predicted ANN chain in the previous step. If chain is an array, params_space will be ignored. If chain is None, params_space should be given. Default: None
params_space (array-like or None) – The parameter space with the shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit]. This is only used for space_type=’hypercube’ and space_type=’LHS’. If chain is an array, params_space will be ignored. If chain is None, params_space should be given. Default: None
spaceSigma (int or array-like, optional) – The size of the parameter space to be learned. It is an int or a numpy array with shape (n,), where n is the number of parameters, e.g. for spaceSigma=5, the parameter space to be learned is \([-5\sigma, +5\sigma]\). Default: 5
params_dict (dict or None, optional) – Information about the cosmological parameters, including the labels, the minimum values, and the maximum values. See params_dict_zoo(). Default: None
space_type (str, optional) – The type of parameter space. It can be ‘hypercube’, ‘LHS’, ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’. Default: ‘hypercube’
cut_crossedLimit (bool, optional) – If True, the data points that cross the parameter limits will be cut. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’. Default: True
cut_crossedBest (bool, optional) – If True, the folded data points that cross the best values will be cut. It is recommended to set it to True. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’, and when cut_crossedLimit=False. Default: True
cross_best (bool, optional) – If True, the folded data points will cross the best values, otherwise, the folded data points will not cross the best values. This only works when space_type is ‘hypersphere’, ‘hyperellipsoid’, or ‘posterior_hyperellipsoid’, and when cut_crossedLimit=False and cut_crossedBest=False. Default: False
local_samples (None, str, or list, optional) – Path of local samples, None, ‘sample’ or [‘sample’] or [‘sample_1’, ‘sample_2’, …]. If None, no local samples are used. Default: None
prevStep_data (None or list, optional) – Samples simulated in the previous step. If a list, it should be [spectra, params]. The spectra or params have shape (N, n), where N is the number of spectra and n is the number of data points in a spectrum. Default: None
check_include (bool, optional) – If True, will check whether params_space is in the space of local_samples, otherwise, do nothing. Default: True
rel_dev_limit (float, optional) – The limit of the relative deviation when params_space is not in the space of sim_params. The default is 20% (this means that if params_space is \([-5\sigma, +5\sigma]\), it can deviate \(<1\sigma\) from sim_params). Note that it should be \(<0.4\) (a deviation \(<2\sigma\) for the parameter space \([-5\sigma, +5\sigma]\)). Default: 0.2
- Variables
prev_space (array-like) – The parameter space of local simulated data (or mock data in previous step), with shape of (n, 2), where n is the number of parameters. For each parameter, it is: [lower_limit, upper_limit].
seed (None or int, optional) – Seed number which controls random draws. Default: None
Note
Either chain or params_space should be given to simulate samples.
- filter_localSample(local_sample, N_local)[source]¶
Select samples from the local data sets.
- Parameters
- Returns
The selected spectra and parameters.
- Return type
array-like
Note
Parameter space of the local samples should be in the initial parameter space.
ecopann.data_processor¶
- class ecopann.data_processor.InverseNormalize(x1, statistic={}, norm_type='z_score', a=0, b=1)[source]¶
Bases: object
Inverse transformation of class Normalize.
- class ecopann.data_processor.Normalize(x, statistic={}, norm_type='z_score', a=0, b=1)[source]¶
Bases: object
Normalize data.
- minmax()[source]¶
Min-max normalization.
Rescale the range of the features to [0, 1] or [a, b]. https://en.wikipedia.org/wiki/Feature_scaling
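Min-max normalization to [a, b] and its inverse can be sketched as below. The standalone functions, the per-column statistics, and the keys of the statistic dictionary are assumptions for illustration, not the actual interface of Normalize and InverseNormalize.

```python
import numpy as np

def minmax(x, a=0.0, b=1.0):
    """Rescale each column of x to [a, b]; also return the statistics used."""
    xmin, xmax = np.min(x, axis=0), np.max(x, axis=0)
    scaled = a + (x - xmin) * (b - a) / (xmax - xmin)
    return scaled, {"xmin": xmin, "xmax": xmax}

def inverse_minmax(x1, statistic, a=0.0, b=1.0):
    """Undo minmax using the stored statistics."""
    xmin, xmax = statistic["xmin"], statistic["xmax"]
    return xmin + (x1 - a) * (xmax - xmin) / (b - a)

x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 40.0]])
scaled, statistic = minmax(x, a=-1.0, b=1.0)
restored = inverse_minmax(scaled, statistic, a=-1.0, b=1.0)
```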
- class ecopann.data_processor.ParamsScaling(params_base)[source]¶
Bases: object
Data preprocessing of cosmological parameters.
- Parameters
params_base (array-like) – A 1-D array that contains the base values of the cosmological parameters.
- class ecopann.data_processor.Statistic(x)[source]¶
Bases: object
Statistics of an array.
- property mean¶
- property std¶
- property xmax¶
- property xmin¶
- ecopann.data_processor.cpu2cuda(data)[source]¶
Transfer data from CPU to GPU.
- Parameters
data (array-like or tensor) – Numpy array or torch tensor.
- Raises
TypeError – The data type should be np.ndarray or torch.Tensor.
- Returns
Torch tensor.
- Return type
Tensor
- ecopann.data_processor.cuda2numpy(data)[source]¶
Transfer data from the torch tensor (on GPU) to the numpy array (on CPU).
- ecopann.data_processor.numpy2cuda(data, device=None)[source]¶
Transfer data from the numpy array (on CPU) to the torch tensor (on GPU).
- ecopann.data_processor.numpy2torch(data)[source]¶
Transfer data from the numpy array (on CPU) to the torch tensor (on CPU).