RFoT.utilities package
Submodules
RFoT.utilities.bin_columns module
- RFoT.utilities.bin_columns.bin_columns(df: pandas.core.frame.DataFrame, bin_max_map, dont_bin: [], bin_scale=1.0)
Maps the tensor dimensions to bins for indexing the tensor. The number of bins for a dimension is the number of unique entries for that dimension in the dataset, scaled by bin_scale if specified.
- Parameters
df (pd.DataFrame) -- Dataset X.
bin_max_map (dict) -- Determines the maximum dimension size.
dont_bin ([]) -- List of column indices from X that should not be binned.
bin_scale (float, optional) -- Scaling factor applied to the number of bins, and therefore to the size of the dimension. The default is 1.0.
- Returns
df -- Dataset X with binned features.
- Return type
pd.DataFrame
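The binning described above can be illustrated with a minimal sketch. This is not the RFoT implementation: the function name bin_columns_sketch is hypothetical, bin_max_map is assumed to be keyed by column name, the binned columns are assumed to be numeric, and the choice of pd.factorize and pd.cut is an assumption made only for illustration.

```python
import pandas as pd


def bin_columns_sketch(df, bin_max_map, dont_bin, bin_scale=1.0):
    """Illustrative only; not the RFoT implementation."""
    binned = df.copy()
    for idx, col in enumerate(binned.columns):
        if idx in dont_bin:
            continue  # leave this column untouched
        n_unique = binned[col].nunique()
        # Number of bins: unique entries, scaled by bin_scale.
        n_bins = max(1, int(n_unique * bin_scale))
        # Assumption: bin_max_map maps a column name to a maximum size.
        cap = bin_max_map.get(col)
        if cap is not None:
            n_bins = min(n_bins, cap)
        if n_bins >= n_unique:
            # One bin per unique value: integer codes 0..n_unique-1.
            binned[col] = pd.factorize(binned[col], sort=True)[0]
        else:
            # Fewer bins than unique values: equal-width bins (numeric data).
            binned[col] = pd.cut(binned[col], bins=n_bins, labels=False)
    return binned


df = pd.DataFrame(
    {
        "user": [10, 20, 20, 30],
        "item": [1, 2, 3, 4],
        "score": [0.5, 0.1, 0.9, 0.3],
    }
)
# Column 2 ("score") is excluded from binning; "item" is capped at 2 bins.
binned = bin_columns_sketch(df, bin_max_map={"item": 2}, dont_bin=[2])
print(binned)
```

After binning, every binned column holds small non-negative integers, so its values can be used directly as tensor indices along that dimension.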
RFoT.utilities.build_tensor module
Builds a tensor in COO format.
- RFoT.utilities.build_tensor.setup_sptensor(df: pandas.core.frame.DataFrame, tensor_option: dict)
Sets up a sparse tensor specified by its non-zero coordinates and a list of values, one for each coordinate entry (index) in the tensor.
- Parameters
df (pd.DataFrame) -- Dataset X.
tensor_option (dict) -- Dictionary of tensor configuration defining the column indices to be dimensions, tensor rank, and entry.
- Returns
Dictionary with non-zero coordinates and non-zero values.
- Return type
dict
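A COO (coordinate) representation stores only the non-zero entries as a coordinate list plus a matching value list. The sketch below shows one way such a dictionary could be built from an already binned DataFrame. It is not the RFoT implementation: the function name, the tensor_option keys "dimensions" and "entry", the output keys, and the decision to sum values that share a coordinate are all assumptions made for illustration.

```python
import pandas as pd


def setup_sptensor_sketch(df, tensor_option):
    """Illustrative only; not the RFoT implementation."""
    # Hypothetical keys: column indices for the dimensions and the entry.
    dim_cols = [df.columns[i] for i in tensor_option["dimensions"]]
    entry_col = df.columns[tensor_option["entry"]]

    # Assumption: rows that share a coordinate are summed into one entry.
    grouped = df.groupby(dim_cols)[entry_col].sum().reset_index()
    # Keep only the non-zero entries, as COO stores nothing else.
    grouped = grouped[grouped[entry_col] != 0]

    coords = grouped[dim_cols].to_numpy(dtype=int)      # shape (nnz, n_dims)
    values = grouped[entry_col].to_numpy(dtype=float)   # shape (nnz,)
    # Dimension sizes follow from the largest index in each column.
    shape = tuple(int(df[c].max()) + 1 for c in dim_cols)
    return {"coords": coords, "values": values, "shape": shape}


df = pd.DataFrame(
    {
        "a": [0, 0, 1, 1],
        "b": [1, 1, 0, 1],
        "v": [2.0, 3.0, 4.0, 0.0],
    }
)
result = setup_sptensor_sketch(df, {"dimensions": [0, 1], "entry": 2})
print(result["coords"])
print(result["values"])
print(result["shape"])
```

In this example the two rows at coordinate (0, 1) collapse into a single entry, and the zero-valued row at (1, 1) is dropped.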
RFoT.utilities.istarmap module
RFoT.utilities.sample_tensor_configs module
Samples random tensor configurations.
- RFoT.utilities.sample_tensor_configs.setup_tensors(min_dimensions: int, max_dimensions: int, X: pandas.core.frame.DataFrame, random_state: int, n_estimators: int, rank, min_rank: int, max_rank: int)
Samples a random set of tensor configurations, each specified by its dimension names, entry feature name, and rank.
- Parameters
min_dimensions (int) -- Minimum number of dimensions.
max_dimensions (int) -- Maximum number of dimensions.
X (pd.DataFrame) -- Dataset X.
random_state (int) -- Random state seed.
n_estimators (int) -- Number of random tensors.
rank (int or str) -- Tensor rank. Can also be "random".
min_rank (int) -- Minimum tensor rank.
max_rank (int) -- Maximum tensor rank.
- Returns
tensor_setups -- Dictionary of random tensor setups.
- Return type
dict
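The sampling described above can be sketched with the standard library alone. This is not the RFoT implementation: the function name and the keys of each setup are hypothetical, the sketch takes a list of column names (for example list(X.columns)) instead of the DataFrame itself, and it assumes that min_rank and max_rank are only used when rank is "random".

```python
import random


def setup_tensors_sketch(columns, min_dimensions, max_dimensions,
                         random_state, n_estimators, rank,
                         min_rank, max_rank):
    """Illustrative only; not the RFoT implementation."""
    rng = random.Random(random_state)  # seeded for reproducibility
    tensor_setups = {}
    for i in range(n_estimators):
        # Pick how many dimensions this tensor has.
        n_dims = rng.randint(min_dimensions, max_dimensions)
        # Draw n_dims + 1 distinct columns: the last one becomes the entry.
        chosen = rng.sample(columns, n_dims + 1)
        dims, entry = chosen[:-1], chosen[-1]
        # Assumption: min_rank/max_rank only apply when rank == "random".
        if rank == "random":
            r = rng.randint(min_rank, max_rank)
        else:
            r = rank
        tensor_setups[i] = {"dimensions": dims, "entry": entry, "rank": r}
    return tensor_setups


columns = ["f0", "f1", "f2", "f3", "f4"]
setups = setup_tensors_sketch(
    columns, min_dimensions=2, max_dimensions=3, random_state=42,
    n_estimators=4, rank="random", min_rank=2, max_rank=6,
)
for key, setup in setups.items():
    print(key, setup)
```

Because the entry column is drawn together with the dimensions, it never coincides with one of them, and calling the function again with the same seed yields the same set of configurations.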