nn.conversion
Common logic for converting olmo_core.nn features to/from other formats (like Hugging Face).
- class olmo_core.nn.conversion.StateConverter(mapping_templates)
Bases: object
A class for converting state from one format to another format (e.g. OLMo Core to HF).
Warning
This is a beta feature! The API is subject to change even with minor and patch releases. If you choose to use this feature, please read the CHANGELOG before upgrading your version of this library.
- get_mappings(state_dict, placeholder_bounds, state_type='weight')
Gets the state mapping from the given state dict to the converted format, without performing conversion.
- Parameters:
state_dict (Dict[str, Any]) – The state dictionary in unconverted format.
placeholder_bounds (Dict[TemplatePlaceholder, int]) – Upper bound values for any relevant placeholders (e.g. for TemplatePlaceholder.EXPERT, the number of experts).
state_type (StateType, default: 'weight') – The type of state this state dict corresponds to. Defaults to StateType.weight.
- Return type:
- convert(state_dict, placeholder_bounds, state_type='weight')
Converts a state dict to another format. This currently only supports tensor values.
- Parameters:
state_dict (Dict[str, Any]) – The state dictionary to convert.
placeholder_bounds (Dict[TemplatePlaceholder, int]) – Upper bound values for any relevant placeholders (e.g. for TemplatePlaceholder.EXPERT, the number of experts).
state_type (StateType, default: 'weight') – The type of state this state dict corresponds to. Defaults to StateType.weight.
- Return type:
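The difference between planning a conversion (as get_mappings does) and applying it (as convert does) can be illustrated with a small standalone sketch. Everything below is hypothetical and does not use olmo_core: the function names, the "[layer]" token, and the key names are made up for illustration, and the values are plain lists rather than tensors.

```python
# Toy illustration only: none of these names come from olmo_core.
from typing import Any, Dict, List, Tuple

LAYER = "[layer]"  # hypothetical placeholder token


def toy_get_mappings(
    state_dict: Dict[str, Any],
    templates: Dict[str, str],
    n_layers: int,
) -> List[Tuple[str, str]]:
    """Plan the conversion: return (source_key, dest_key) pairs without touching values."""
    pairs = []
    for src_template, dest_template in templates.items():
        # Templates without a placeholder are resolved exactly once.
        layers = range(n_layers) if LAYER in src_template else range(1)
        for layer in layers:
            src = src_template.replace(LAYER, str(layer))
            if src in state_dict:  # only map keys that are actually present
                pairs.append((src, dest_template.replace(LAYER, str(layer))))
    return pairs


def toy_convert(
    state_dict: Dict[str, Any],
    templates: Dict[str, str],
    n_layers: int,
) -> Dict[str, Any]:
    """Apply the plan: store each value under its destination key."""
    plan = toy_get_mappings(state_dict, templates, n_layers)
    return {dest: state_dict[src] for src, dest in plan}


# Made-up key names for a two-layer model.
templates = {
    "blocks.[layer].attention.w_q.weight": "model.layers.[layer].self_attn.q_proj.weight",
    "embeddings.weight": "model.embed_tokens.weight",
}
state = {
    "embeddings.weight": [0.1, 0.2],
    "blocks.0.attention.w_q.weight": [1.0],
    "blocks.1.attention.w_q.weight": [2.0],
}

plan = toy_get_mappings(state, templates, n_layers=2)
converted = toy_convert(state, templates, n_layers=2)
```

Inspecting the plan before converting is useful for checking which keys will be produced without moving any data.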
- class olmo_core.nn.conversion.StateMapping(source_keys, dest_keys, state_type='weight', source_concat_dim=0, unflatten_dim=None, dims_permutation=None, flatten_dims=None, dest_chunk_dim=0)
Bases: object
A mapping of state from one format to another format (e.g. OLMo Core to HF).
The most standard mapping is a one-to-one state mapping, which corresponds to a single string entry for both source_keys and dest_keys. The class also supports more complicated mappings, like many-to-many mappings or mappings that also require further manipulations of state like permuting dimensions.
- source_concat_dim: int = 0
When many states are being mapped from, this specifies the dimension on which to combine them.
- unflatten_dim: Optional[Tuple[int, Tuple[int, ...]]] = None
This specifies that the given dimension (unflatten_dim[0]) should be unflattened using the shape given in unflatten_dim[1].
- dims_permutation: Optional[Tuple[int, ...]] = None
This specifies the permutation that should be applied to the dimensions of the state after any unflattening from unflatten_dim has occurred.
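The effect of these three fields can be sketched with NumPy arrays standing in for tensors. This is an illustration under assumptions, not the library's implementation: apply_toy_mapping is a hypothetical helper, and running the concatenation before the unflattening is assumed here (the text above only states that the permutation follows the unflattening).

```python
import numpy as np


def apply_toy_mapping(sources, source_concat_dim=0, unflatten_dim=None, dims_permutation=None):
    """Hypothetical helper: combine source arrays, then unflatten, then permute."""
    # Combine the many source states along one dimension.
    state = np.concatenate(sources, axis=source_concat_dim)
    if unflatten_dim is not None:
        dim, shape = unflatten_dim  # assumes a non-negative dimension index
        # Split dimension `dim` into the dimensions given by `shape`.
        new_shape = state.shape[:dim] + tuple(shape) + state.shape[dim + 1:]
        state = state.reshape(new_shape)
    if dims_permutation is not None:
        # Reorder dimensions only after any unflattening has happened.
        state = np.transpose(state, dims_permutation)
    return state


w_a = np.arange(6).reshape(2, 3)      # rows [0, 1, 2] and [3, 4, 5]
w_b = np.arange(6, 12).reshape(2, 3)  # rows [6, 7, 8] and [9, 10, 11]

out = apply_toy_mapping(
    [w_a, w_b],
    source_concat_dim=0,        # (2, 3) + (2, 3) -> (4, 3)
    unflatten_dim=(0, (2, 2)),  # (4, 3) -> (2, 2, 3)
    dims_permutation=(1, 0, 2), # swap the two new leading dimensions
)
```

Because the permutation indexes the dimensions that exist after unflattening, dims_permutation has three entries here even though the concatenated array had only two dimensions.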
- class olmo_core.nn.conversion.StateMappingTemplate(source_template_keys, dest_template_keys, state_type='weight', source_key_per_placeholder=None, dest_key_per_placeholder=None, source_concat_dim=0, unflatten_dim=None, dims_permutation=None, flatten_dims=None, dest_chunk_dim=0)
Bases: object
The template for a state mapping from one format to another format (e.g. OLMo Core to HF). These mappings are ‘templates’ since they support keys and other metadata having placeholders for information like the layer number or number of MoE experts. This class can be converted to a StateMapping by providing the placeholder information.
The most standard mapping is a one-to-one state mapping, which corresponds to a single string entry for both source_template_keys and dest_template_keys. The class also supports more complicated mappings, like many-to-many mappings or mappings that also require further manipulations of state like permuting dimensions.
- source_template_keys: Union[str, Tuple[str, ...]]
The key or keys of the state(s) being mapped from.
- source_key_per_placeholder: Optional[TemplatePlaceholder] = None
A placeholder in source_template_keys for which this mapping should map all valid placeholder values, rather than one specific value. For example, this enables mapping states from all experts (using TemplatePlaceholder.EXPERT) to a single state.
When provided, source_template_keys must be a string.
- dest_key_per_placeholder: Optional[TemplatePlaceholder] = None
A placeholder in dest_template_keys for which this mapping should map all valid placeholder values, rather than one specific value. For example, this enables mapping from a single state to states from all experts (using TemplatePlaceholder.EXPERT).
When provided, dest_template_keys must be a string.
- source_concat_dim: int = 0
When many states are being mapped from, this specifies the dimension on which to combine them.
- unflatten_dim: Optional[Tuple[int, Tuple[TemplatePlaceholder | int, ...]]] = None
This specifies that the given dimension (unflatten_dim[0]) should be unflattened using the shape given in unflatten_dim[1]. A placeholder can be given instead of a number, to represent its corresponding upper bound (e.g. TemplatePlaceholder.EXPERT represents the number of experts).
- dims_permutation: Optional[Tuple[int, ...]] = None
This specifies the permutation that should be applied to the dimensions of the state after any unflattening from unflatten_dim has occurred.
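How a template might be resolved into concrete keys and shapes can be sketched in plain Python. This is a standalone illustration under assumptions and does not call olmo_core: the "[layer]" and "[expert]" tokens, the helper names, and the key names are hypothetical, and valid placeholder values are assumed to be the zero-based integers below the upper bound.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical placeholder tokens (not taken from olmo_core).
LAYER, EXPERT = "[layer]", "[expert]"


def expand_per_placeholder(template_key: str, placeholder: str, bound: int) -> List[str]:
    """Produce one key per assumed placeholder value: 0, 1, ..., bound - 1."""
    return [template_key.replace(placeholder, str(i)) for i in range(bound)]


def resolve_shape(
    shape: Tuple[Union[str, int], ...],
    bounds: Dict[str, int],
) -> Tuple[int, ...]:
    """Swap each placeholder in an unflatten shape for its upper bound."""
    return tuple(bounds[d] if isinstance(d, str) else d for d in shape)


bounds = {LAYER: 2, EXPERT: 4}

# A 'key per placeholder' template: a single string that fans out over all experts.
source_template = "blocks.[layer].experts.[expert].w1.weight"  # made-up key name
layer_key = source_template.replace(LAYER, "0")  # fix one specific layer
source_keys = expand_per_placeholder(layer_key, EXPERT, bounds[EXPERT])

# An unflatten spec whose shape mixes a placeholder with a literal size.
unflatten_dim = (0, resolve_shape((EXPERT, 8), bounds))
```

In this sketch the layer placeholder takes one specific value per resolved mapping, while the expert placeholder fans out into every value, which mirrors the distinction the per-placeholder fields draw.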