io

olmo_core.io.is_url(path)

Check if a path is a URL.

Parameters:

path (Union[Path, PathLike, str]) – Path-like object to check.

Return type:

bool
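To illustrate what such a check might look like, here is a minimal standard-library sketch. The helper `looks_like_url` and the `scheme://` regular expression are assumptions made for illustration; they are not taken from olmo_core's source, whose actual rules may differ.

```python
import re
from os import PathLike
from pathlib import Path
from typing import Union

PathOrStr = Union[Path, PathLike, str]

# Hypothetical stand-in for illustration only; not olmo_core's implementation.
# Matches a leading "scheme://" such as "s3://", "gs://", or "https://".
_SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def looks_like_url(path: PathOrStr) -> bool:
    """Return True if ``path`` starts with a ``scheme://`` prefix."""
    return _SCHEME_RE.match(str(path)) is not None


print(looks_like_url("s3://bucket/checkpoints/step100"))  # True
print(looks_like_url(Path("/data/checkpoints/step100")))  # False
```

A check of this kind lets calling code branch between local filesystem logic and remote-storage logic before touching the path.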

olmo_core.io.file_size(path)

Get the size of a local or remote file in bytes.

Parameters:

path (Union[Path, PathLike, str]) – Path/URL to the file.

Return type:

int

olmo_core.io.get_bytes_range(path, bytes_start, num_bytes)

Get a range of bytes from a file.

Parameters:
  • path – Path/URL to the file.

  • bytes_start (int) – Byte offset to start at.

  • num_bytes (int) – Number of bytes to get.

Return type:

bytes
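The semantics of `bytes_start` and `num_bytes` can be sketched for the local-file case with the standard library alone. The helper `read_bytes_range` below is a hypothetical stand-in, not olmo_core's code; it assumes the range starts at a zero-based offset and covers `num_bytes` bytes from there, and it does not handle URLs.

```python
import tempfile
from os import PathLike
from pathlib import Path
from typing import Union


# Hypothetical local-only stand-in; remote URLs are not handled here.
def read_bytes_range(
    path: Union[Path, PathLike, str], bytes_start: int, num_bytes: int
) -> bytes:
    """Read ``num_bytes`` bytes starting at zero-based offset ``bytes_start``."""
    with open(path, "rb") as f:
        f.seek(bytes_start)
        return f.read(num_bytes)


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"0123456789")
    chunk = read_bytes_range(demo, bytes_start=2, num_bytes=4)
    total = demo.stat().st_size  # local analogue of a file-size lookup

print(chunk)  # b'2345'
print(total)  # 10
```

Reading a range rather than the whole file is useful when only a slice of a large file is needed, for example one record of a known size at a known offset.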

olmo_core.io.upload(source, target, save_overwrite=False)

Upload source file to a target location on GCS or S3.

Parameters:
  • source (Union[Path, PathLike, str]) – Path to the file to upload.

  • target (str) – Target URL to upload to.

  • save_overwrite (bool, default: False) – Overwrite any existing file.

olmo_core.io.dir_is_empty(dir)

Check if a local directory is empty. Also returns True if the directory does not exist.

Parameters:

dir (Union[Path, PathLike, str]) – Path to the local directory.

Return type:

bool
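The "missing counts as empty" behavior described above can be sketched as follows. The helper `local_dir_is_empty` is a hypothetical stand-in written with `pathlib`; how olmo_core treats edge cases such as a path that points to a regular file is an assumption here.

```python
import tempfile
from os import PathLike
from pathlib import Path
from typing import Union


# Hypothetical stand-in for illustration; not olmo_core's implementation.
def local_dir_is_empty(dir: Union[Path, PathLike, str]) -> bool:
    """Return True if ``dir`` is missing or contains no entries."""
    dir = Path(dir)
    if not dir.is_dir():
        # Assumption: anything that is not an existing directory counts as empty.
        return True
    # Stop at the first entry instead of listing the whole directory.
    return next(dir.iterdir(), None) is None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    missing = local_dir_is_empty(root / "does-not-exist")
    fresh = local_dir_is_empty(root)
    (root / "model.pt").write_bytes(b"")
    populated = local_dir_is_empty(root)

print(missing, fresh, populated)  # True True False
```

Treating a missing directory as empty means a caller can guard against overwriting existing output with a single check, without first testing whether the directory exists.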

olmo_core.io.file_exists(path)

Check if a file exists.

Parameters:

path (Union[Path, PathLike, str]) – Path/URL to a file.

Return type:

bool

olmo_core.io.clear_directory(dir)

Clear out the contents of a local or remote directory. GCS (gs://) and S3 (s3://) URLs are supported.

Parameters:

dir (Union[Path, PathLike, str]) – Path/URL to the directory.

olmo_core.io.serialize_to_tensor(x)

Serialize an object to a byte tensor using pickle.

Parameters:

x (Any) – The picklable object to serialize.

Return type:

Tensor

olmo_core.io.deserialize_from_tensor(data)

Deserialize an object from a byte tensor using pickle.

Parameters:

data (Tensor) – The byte tensor to deserialize.

Return type:

Any
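The pickle round trip behind this pair of functions can be sketched without PyTorch. In the sketch below a `bytearray` stands in for the byte tensor, and the helper names `serialize_to_buffer` and `deserialize_from_buffer` are hypothetical; how olmo_core actually builds the tensor is not shown here.

```python
import pickle
from typing import Any


# Hypothetical torch-free stand-ins; a bytearray plays the role of the byte tensor.
def serialize_to_buffer(x: Any) -> bytearray:
    """Pickle ``x`` and return the raw bytes as a mutable buffer."""
    return bytearray(pickle.dumps(x))


def deserialize_from_buffer(data: bytearray) -> Any:
    """Rebuild the original object from a buffer of pickled bytes."""
    return pickle.loads(bytes(data))


state = {"step": 100, "lr": 3e-4, "tags": ["warmup", "cosine"]}
buffer = serialize_to_buffer(state)
restored = deserialize_from_buffer(buffer)

print(restored == state)  # True
```

Packing arbitrary Python objects into a flat byte buffer is a common way to move them through tensor-only communication paths. Because unpickling can execute arbitrary code, only data from a trusted source should be deserialized this way.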