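Qlib Recorder: Experiment Management

Introduction

Qlib contains an experiment management system named QlibRecorder, which is designed to help users handle experiments and analyze results efficiently.

The system consists of three components:

  • ExperimentManager

    The class that manages experiments.

  • Experiment

    The experiment class. Each instance is responsible for a single experiment.

  • Recorder

    The recorder class. Each instance is responsible for a single run.

Here is an overview of the system's structure: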

ExperimentManager
    - Experiment 1
        - Recorder 1
        - Recorder 2
        - ...
    - Experiment 2
        - Recorder 1
        - Recorder 2
        - ...
    - ...
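The three-level hierarchy can be pictured with a small toy model. This is only a sketch of the containment relationship: the classes ToyExperimentManager, ToyExperiment, and ToyRecorder are hypothetical stand-ins written for illustration and are not Qlib's classes.

```python
from dataclasses import dataclass, field
from typing import Dict


# Toy stand-ins for illustration only; these are NOT Qlib's classes.
@dataclass
class ToyRecorder:
    name: str
    metrics: Dict[str, float] = field(default_factory=dict)


@dataclass
class ToyExperiment:
    name: str
    recorders: Dict[str, ToyRecorder] = field(default_factory=dict)

    def create_recorder(self, name: str) -> ToyRecorder:
        # One recorder corresponds to one run of this experiment.
        rec = ToyRecorder(name)
        self.recorders[name] = rec
        return rec


@dataclass
class ToyExperimentManager:
    experiments: Dict[str, ToyExperiment] = field(default_factory=dict)

    def get_or_create(self, name: str) -> ToyExperiment:
        # Return the existing experiment or register a new one.
        if name not in self.experiments:
            self.experiments[name] = ToyExperiment(name)
        return self.experiments[name]


manager = ToyExperimentManager()
exp = manager.get_or_create("Experiment 1")
exp.create_recorder("Recorder 1").metrics["ic"] = 0.05
exp.create_recorder("Recorder 2")
manager.get_or_create("Experiment 2").create_recorder("Recorder 1")

# Mirror of the tree shown above: experiment name -> recorder names.
tree = {name: sorted(e.recorders) for name, e in manager.experiments.items()}
print(tree)
```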

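This experiment management system defines a set of interfaces and provides a concrete implementation, MLflowExpManager, which is based on the machine learning platform MLflow (link: <https://mlflow.org/>).

If users set the implementation of ExpManager to MLflowExpManager, they can use the command mlflow ui to visualize and inspect the experiment results. For more information, please refer to the related MLflow documentation.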

Qlib Recorder

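QlibRecorder provides a high-level API for working with the experiment management system. The interfaces are wrapped in the variable R in Qlib, and users can interact with the system directly through R. The following command shows how to import R in Python: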

from qlib.workflow import R

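QlibRecorder includes several commonly used APIs for managing experiments and recorders within a workflow. For more available APIs, please refer to the sections below on Experiment Manager, Experiment, and Recorder.

Here are the available interfaces of QlibRecorder: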

class qlib.workflow.__init__.QlibRecorder(exp_manager: ExpManager)

A global system that helps to manage the experiments.

__init__(exp_manager: ExpManager)
start(*, experiment_id: str | None = None, experiment_name: str | None = None, recorder_id: str | None = None, recorder_name: str | None = None, uri: str | None = None, resume: bool = False)

Method to start an experiment. This method can only be called within a Python with statement. Here is the example code:

# start new experiment and recorder
with R.start(experiment_name='test', recorder_name='recorder_1'):
    model.fit(dataset)
    R.log...
    ... # further operations

# resume previous experiment and recorder
with R.start(experiment_name='test', recorder_name='recorder_1', resume=True): # if users want to resume recorder, they have to specify the exact same name for experiment and recorder.
    ... # further operations
Parameters:
  • experiment_id (str) – id of the experiment one wants to start.

  • experiment_name (str) – name of the experiment one wants to start.

  • recorder_id (str) – id of the recorder under the experiment one wants to start.

  • recorder_name (str) – name of the recorder under the experiment one wants to start.

  • uri (str) – The tracking uri of the experiment, where all the artifacts/metrics etc. will be stored. The default uri is set in the qlib.config. Note that this uri argument will not change the one defined in the config file. Therefore, the next time when users call this function in the same experiment, they have to also specify this argument with the same value. Otherwise, inconsistent uri may occur.

  • resume (bool) – whether to resume the specific recorder with given name under the given experiment.

start_exp(*, experiment_id=None, experiment_name=None, recorder_id=None, recorder_name=None, uri=None, resume=False)

Lower level method for starting an experiment. When using this method, one should end the experiment manually, and the status of the recorder may not be handled properly. Here is the example code:

R.start_exp(experiment_name='test', recorder_name='recorder_1')
... # further operations
R.end_exp('FINISHED')  # or, equivalently, R.end_exp(Recorder.STATUS_FI)
Parameters:
  • experiment_id (str) – id of the experiment one wants to start.

  • experiment_name (str) – the name of the experiment to be started

  • recorder_id (str) – id of the recorder under the experiment one wants to start.

  • recorder_name (str) – name of the recorder under the experiment one wants to start.

  • uri (str) – the tracking uri of the experiment, where all the artifacts/metrics etc. will be stored. The default uri is set in the qlib.config.

  • resume (bool) – whether to resume the specific recorder with given name under the given experiment.

Return type:

An experiment instance being started.

end_exp(recorder_status='FINISHED')

Method for ending an experiment manually. It will end the current active experiment, as well as its active recorder with the specified status type. Here is the example code of the method:

R.start_exp(experiment_name='test')
... # further operations
R.end_exp('FINISHED')  # or, equivalently, R.end_exp(Recorder.STATUS_FI)
Parameters:

recorder_status (str) – The status of a recorder, which can be SCHEDULED, RUNNING, FINISHED, FAILED.
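The difference between the with-based start and the manual start_exp / end_exp pair can be sketched with a toy manager. This is a minimal sketch under an assumption: that the context manager ends the run as FINISHED on normal exit and as FAILED when the body raises. ToyManager is a hypothetical class and not Qlib's implementation.

```python
from contextlib import contextmanager


class ToyManager:
    """Hypothetical stand-in that only records the calls it receives."""

    def __init__(self):
        self.history = []

    def start_exp(self, experiment_name):
        self.history.append(("start", experiment_name))

    def end_exp(self, recorder_status="FINISHED"):
        self.history.append(("end", recorder_status))

    @contextmanager
    def start(self, experiment_name):
        # Assumed behavior: wrap start_exp/end_exp and pick the final status.
        self.start_exp(experiment_name)
        try:
            yield self
        except Exception:
            self.end_exp("FAILED")  # the body raised, so mark the run as failed
            raise
        self.end_exp("FINISHED")  # normal exit


toy = ToyManager()

with toy.start("test"):
    pass  # a run that completes normally

try:
    with toy.start("test"):
        raise ValueError("training diverged")  # a run that fails
except ValueError:
    pass

statuses = [status for kind, status in toy.history if kind == "end"]
print(statuses)
```

With the manual pair, the caller is responsible for the try/except bookkeeping that the with statement performs here, which is why the status may not be handled properly when start_exp is used directly.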

search_records(experiment_ids, **kwargs)

Get a pandas DataFrame of records that fit the search criteria.

The arguments of this function are not fixed, and they differ between implementations of ExpManager in Qlib. Qlib currently provides an implementation of ExpManager based on mlflow, and here is the example code of the method with the MLflowExpManager:

R.log_metrics(m=2.50, step=0)
records = R.search_records([experiment_id], order_by=["metrics.m DESC"])
Parameters:
  • experiment_ids (list) – list of experiment IDs.

  • filter_string (str) – filter query string, defaults to searching all runs.

  • run_view_type (int) – one of enum values ACTIVE_ONLY, DELETED_ONLY, or ALL (e.g. in mlflow.entities.ViewType).

  • max_results (int) – the maximum number of runs to put in the dataframe.

  • order_by (list) – list of columns to order by (e.g., “metrics.rmse”).

Returns:

A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don’t have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
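The column layout described above can be illustrated without any backend. The following is a sketch only: it uses plain dictionaries instead of a pandas.DataFrame, and the helper names flatten_record and to_table as well as the sample runs are hypothetical.

```python
import math
from typing import Any, Dict, List


def flatten_record(metrics: Dict[str, float], params: Dict[str, str], tags: Dict[str, str]) -> Dict[str, Any]:
    """Expand one run into prefixed columns: metrics.*, params.*, tags.*."""
    row: Dict[str, Any] = {}
    for prefix, values in (("metrics", metrics), ("params", params), ("tags", tags)):
        for key, value in values.items():
            row[f"{prefix}.{key}"] = value
    return row


def to_table(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Give every row the same columns; fill gaps with NaN (metrics) or None."""
    columns = sorted({col for row in rows for col in row})

    def missing(col: str) -> Any:
        return math.nan if col.startswith("metrics.") else None

    return [{col: row.get(col, missing(col)) for col in columns} for row in rows]


# Two made-up runs that logged different things.
run_a = flatten_record({"rmse": 0.12}, {"lr": "0.01"}, {"stage": "dev"})
run_b = flatten_record({"m": 2.5}, {}, {})
table = to_table([run_a, run_b])

for row in table:
    print(row)
```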

list_experiments()

Method for listing all the existing experiments (except for those being deleted.)

exps = R.list_experiments()
Return type:

A dictionary (name -> experiment) of the stored experiment information.

list_recorders(experiment_id=None, experiment_name=None)

Method for listing all the recorders of experiment with given id or name.

If the user doesn’t provide the id or name of the experiment, this method will try to retrieve the default experiment and list all the recorders of the default experiment. If the default experiment doesn’t exist, the method will first create the default experiment, and then create a new recorder under it. (More information about the default experiment can be found in the Experiment section.)

Here is the example code:

recorders = R.list_recorders(experiment_name='test')
Parameters:
  • experiment_id (str) – id of the experiment.

  • experiment_name (str) – name of the experiment.

Return type:

A dictionary (id -> recorder) of the stored recorder information.

get_exp(*, experiment_id=None, experiment_name=None, create: bool = True, start: bool = False) Experiment

Method for retrieving an experiment with given id or name. Once the create argument is set to True, if no valid experiment is found, this method will create one for you. Otherwise, it will only retrieve a specific experiment or raise an Error.

  • If ‘create’ is True:

    • If an active experiment exists:

      • no id or name specified: return the active experiment.

      • id or name specified: return the specified experiment. If no such experiment is found, create a new experiment with the given id or name.

    • If no active experiment exists:

      • no id or name specified: create a default experiment, which is then set to be active.

      • id or name specified: return the specified experiment. If no such experiment is found, create a new experiment with the given name or the default experiment.

  • Else, if ‘create’ is False:

    • If an active experiment exists:

      • no id or name specified: return the active experiment.

      • id or name specified: return the specified experiment. If no such experiment is found, raise Error.

    • If no active experiment exists:

      • no id or name specified: if the default experiment exists, return it; otherwise, raise Error.

      • id or name specified: return the specified experiment. If no such experiment is found, raise Error.
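The decision rules above can be condensed into a few lines. The sketch below is a toy model, not Qlib's implementation: resolve_experiment and its arguments are hypothetical, experiments are identified by name only, and activation (the start argument) is not modeled. The default name "Experiment" is the one stated later in this document.

```python
from typing import Dict, Optional

DEFAULT_NAME = "Experiment"  # default experiment name stated later in this document


def resolve_experiment(
    store: Dict[str, dict],
    active: Optional[str],
    name: Optional[str] = None,
    create: bool = True,
) -> str:
    """Toy model of the lookup rules; returns the selected experiment name."""
    if name is None:
        if active is not None:
            return active  # an active experiment wins when nothing is specified
        name = DEFAULT_NAME  # otherwise fall back to the default experiment
    if name in store:
        return name  # the requested (or default) experiment exists
    if create:
        store[name] = {}  # create=True: create the missing experiment
        return name
    raise ValueError(f"experiment {name!r} does not exist")  # create=False


store = {"test": {}}

# Active experiment, nothing specified -> the active experiment.
case_active = resolve_experiment(store, active="test")

# Active experiment, unknown name, create=True -> a new experiment is created.
case_created = resolve_experiment(store, active="test", name="test1")

# No active experiment, no default experiment, create=False -> error.
try:
    resolve_experiment({}, active=None, create=False)
    case_missing = "returned"
except ValueError:
    case_missing = "error"

print(case_active, case_created, case_missing)
```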

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    exp = R.get_exp()
    recorders = exp.list_recorders()

# Case 2
with R.start(experiment_name='test'):
    exp = R.get_exp(experiment_name='test1')

# Case 3
exp = R.get_exp()  # returns a default experiment

# Case 4
exp = R.get_exp(experiment_name='test')

# Case 5
exp = R.get_exp(create=False)  # returns the default experiment if it exists
Parameters:
  • experiment_id (str) – id of the experiment.

  • experiment_name (str) – name of the experiment.

  • create (boolean) – determines whether the method automatically creates a new experiment according to the user’s specification if the experiment hasn’t been created before.

  • start (bool) – when start is True, an experiment that has not been started (not activated) will be started. It is designed for R.log_params to start experiments automatically.

Return type:

An experiment instance with given id or name.

delete_exp(experiment_id=None, experiment_name=None)

Method for deleting the experiment with given id or name. At least one of id or name must be given; otherwise, an error is raised.

Here is the example code:

R.delete_exp(experiment_name='test')
Parameters:
  • experiment_id (str) – id of the experiment.

  • experiment_name (str) – name of the experiment.

get_uri()

Method for retrieving the uri of current experiment manager.

Here is the example code:

uri = R.get_uri()
Return type:

The uri of current experiment manager.

set_uri(uri: str | None)

Method to reset the default uri of current experiment manager.

NOTE:

  • When the uri refers to a file path, please use the absolute path instead of strings like “~/mlruns/”. The backend doesn’t support strings like this.

uri_context(uri: str)

Temporarily set the exp_manager’s default_uri to uri

NOTE:

  • Please refer to the NOTE in set_uri.

Parameters:

uri (Text) – the temporary uri.
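The two notes above (use absolute paths, and set the uri only temporarily) can be sketched together. This is an illustration under assumptions: ToyUriHolder and to_absolute are hypothetical names, and the sketch only shows the override-and-restore pattern, not how Qlib stores its uri.

```python
import os
from contextlib import contextmanager


def to_absolute(path: str) -> str:
    """Expand a leading '~' and make the path absolute before using it as a uri."""
    return os.path.abspath(os.path.expanduser(path))


class ToyUriHolder:
    """Hypothetical stand-in for an object that owns a default uri."""

    def __init__(self, default_uri: str):
        self.default_uri = default_uri

    @contextmanager
    def uri_context(self, uri: str):
        previous = self.default_uri
        self.default_uri = uri  # temporary override
        try:
            yield
        finally:
            self.default_uri = previous  # restored even if the body raises


holder = ToyUriHolder(to_absolute("mlruns"))

with holder.uri_context(to_absolute("other_runs")):
    inside = holder.default_uri  # the temporary uri is visible here

after = holder.default_uri  # the original uri is back
print(inside, after)
```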

get_recorder(*, recorder_id=None, recorder_name=None, experiment_id=None, experiment_name=None) Recorder

Method for retrieving a recorder.

  • If an active recorder exists:

    • no id or name specified: return the active recorder.

    • id or name specified: return the specified recorder.

  • If no active recorder exists:

    • no id or name specified: raise Error.

    • id or name specified: the corresponding experiment_name must also be given in order to return the specified recorder. Otherwise, raise Error.

The recorder can be used for further process such as save_object, load_object, log_params, log_metrics, etc.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    recorder = R.get_recorder()

# Case 2
with R.start(experiment_name='test'):
    recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')

# Case 3
recorder = R.get_recorder()  # raises an error

# Case 4
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')  # raises an error

# Case 5
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d', experiment_name='test')

Here are some things users may be concerned about:

  • Q: Which recorder is returned if multiple recorders meet the query (e.g. a query with experiment_name)?

  • A: If the mlflow backend is used, the recorder with the latest start_time is returned, because MLflow’s search_runs function guarantees it.

Parameters:
  • recorder_id (str) – id of the recorder.

  • recorder_name (str) – name of the recorder.

  • experiment_name (str) – name of the experiment.

Return type:

A recorder instance.

delete_recorder(recorder_id=None, recorder_name=None)

Method for deleting the recorder with given id or name. At least one of id or name must be given; otherwise, an error is raised.

Here is the example code:

R.delete_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')
Parameters:
  • recorder_id (str) – id of the recorder.

  • recorder_name (str) – name of the recorder.

save_objects(local_path=None, artifact_path=None, **kwargs: Dict[str, Any])

Method for saving objects as artifacts in the experiment to the uri. It supports either saving from a local file/directory, or directly saving objects. Users can use valid Python keyword arguments to specify the object to be saved as well as its name (name: value).

In summary, this API is designed for saving objects to the experiment management backend path:

  1. Qlib provides two methods to specify objects:

    • Passing in the object directly with **kwargs (e.g. R.save_objects(trained_model=model)).

    • Passing in the local path to the object, i.e. the local_path parameter.

  2. artifact_path represents the experiment management backend path.

  • If an active recorder exists: it will save the objects through the active recorder.

  • If no active recorder exists: the system will create a default experiment and a new recorder, and save the objects under it.

Note

If one wants to save objects with a specific recorder, it is recommended to first get that recorder through the get_recorder API and then use the recorder to save objects. The supported arguments are the same as for this method.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    pred = model.predict(dataset)
    R.save_objects(**{"pred.pkl": pred}, artifact_path='prediction')
    rid = R.get_recorder().id
...
R.get_recorder(recorder_id=rid).load_object("prediction/pred.pkl")  #  after saving objects, you can load the previous object with this api

# Case 2
with R.start(experiment_name='test'):
    R.save_objects(local_path='results/pred.pkl', artifact_path="prediction")
    rid = R.get_recorder().id
...
R.get_recorder(recorder_id=rid).load_object("prediction/pred.pkl")  #  after saving objects, you can load the previous object with this api
Parameters:
  • local_path (str) – if provided, then save the file or directory to the artifact URI.

  • artifact_path (str) – the relative path for the artifact to be stored in the URI.

  • **kwargs (Dict[Text, Any]) – the object to be saved. For example, {“pred.pkl”: pred}
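How the keyword name and artifact_path combine into the path later passed to load_object can be sketched with a toy store. This is an illustration under assumptions: ToyArtifactStore is a hypothetical class that pickles objects into a temporary directory, and it does not reflect how the real backend stores artifacts.

```python
import pickle
import tempfile
from pathlib import Path


class ToyArtifactStore:
    """Hypothetical stand-in that keeps artifacts under a local root directory."""

    def __init__(self, root: str):
        self.root = Path(root)

    def save_objects(self, artifact_path=None, **kwargs):
        # Each keyword name becomes a file name below artifact_path.
        target = self.root / (artifact_path or "")
        target.mkdir(parents=True, exist_ok=True)
        for name, obj in kwargs.items():
            with open(target / name, "wb") as f:
                pickle.dump(obj, f)

    def load_object(self, name: str):
        # name is the path relative to the root, e.g. "prediction/pred.pkl".
        with open(self.root / name, "rb") as f:
            return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    store = ToyArtifactStore(tmp)
    # A dict is unpacked because "pred.pkl" is not a valid Python identifier.
    store.save_objects(**{"pred.pkl": [0.1, -0.2, 0.3]}, artifact_path="prediction")
    loaded = store.load_object("prediction/pred.pkl")

print(loaded)
```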

load_object(name: str)

Method for loading an object from artifacts in the experiment in the uri.

log_params(**kwargs)

Method for logging parameters during an experiment. In addition to using R, one can also log to a specific recorder after getting it with get_recorder API.

  • If active recorder exists: it will log parameters through the active recorder.

  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and log parameters under it.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    R.log_params(learning_rate=0.01)

# Case 2
R.log_params(learning_rate=0.01)
Parameters:

argument (keyword) – name1=value1, name2=value2, …

log_metrics(step=None, **kwargs)

Method for logging metrics during an experiment. In addition to using R, one can also log to a specific recorder after getting it with get_recorder API.

  • If active recorder exists: it will log metrics through the active recorder.

  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and log metrics under it.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    R.log_metrics(train_loss=0.33, step=1)

# Case 2
R.log_metrics(train_loss=0.33, step=1)
Parameters:

argument (keyword) – name1=value1, name2=value2, …

log_artifact(local_path: str, artifact_path: str | None = None)

Log a local file or directory as an artifact of the currently active run

  • If an active recorder exists: it will log the artifact through the active recorder.

  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and log the artifact under it.

Parameters:
  • local_path (str) – Path to the file to write.

  • artifact_path (Optional[str]) – If provided, the directory in artifact_uri to write to.

download_artifact(path: str, dst_path: str | None = None) str

Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.

Parameters:
  • path (str) – Relative source path to the desired artifact.

  • dst_path (Optional[str]) – Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will be downloaded to a new uniquely-named directory on the local filesystem.

Returns:

Local path of desired artifact.

Return type:

str

set_tags(**kwargs)

Method for setting tags for a recorder. In addition to using R, one can also set the tag to a specific recorder after getting it with get_recorder API.

  • If active recorder exists: it will set tags through the active recorder.

  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and set the tags under it.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    R.set_tags(release_version="2.2.0")

# Case 2
R.set_tags(release_version="2.2.0")
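Parameters:

argument (keyword) – name1=value1, name2=value2, …

Experiment Manager

The ExpManager module in Qlib is responsible for managing different experiments. Most of the APIs of ExpManager are similar to those of QlibRecorder, and the most important API is the get_exp method. Users can refer directly to the documentation above for details on how to use get_exp.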

class qlib.workflow.expm.ExpManager(uri: str, default_exp_name: str | None)

This is the ExpManager class for managing experiments. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)

The ExpManager is expected to be a singleton. (Multiple Experiments with different uris can still exist: users can get different experiments from different uris and then compare their records.) The global config (i.e. C) is also a singleton.

So we try to align them together. They share the same variable, which is called default uri. Please refer to ExpManager.default_uri for details of variable sharing.

When the user starts an experiment, the user may want to set the uri to a specific uri (it will override default uri during this period), and then unset the specific uri and fallback to the default uri. ExpManager._active_exp_uri is that specific uri.

__init__(uri: str, default_exp_name: str | None)
start_exp(*, experiment_id: str | None = None, experiment_name: str | None = None, recorder_id: str | None = None, recorder_name: str | None = None, uri: str | None = None, resume: bool = False, **kwargs) Experiment

Start an experiment. This method includes first get_or_create an experiment, and then set it to be active.

Maintaining _active_exp_uri is included in start_exp; the remaining implementation should be included in _start_exp in the subclass.

Parameters:
  • experiment_id (str) – id of the active experiment.

  • experiment_name (str) – name of the active experiment.

  • recorder_id (str) – id of the recorder to be started.

  • recorder_name (str) – name of the recorder to be started.

  • uri (str) – the current tracking URI.

  • resume (boolean) – whether to resume the experiment and recorder.

Return type:

An active experiment.

end_exp(recorder_status: str = 'SCHEDULED', **kwargs)

End an active experiment.

Maintaining _active_exp_uri is included in end_exp, remaining implementation should be included in _end_exp in subclass

Parameters:
  • experiment_name (str) – name of the active experiment.

  • recorder_status (str) – the status of the active recorder of the experiment.

create_exp(experiment_name: str | None = None)

Create an experiment.

Parameters:

experiment_name (str) – the experiment name, which must be unique.

Return type:

An experiment object.

Raises:

ExpAlreadyExistError

search_records(experiment_ids=None, **kwargs)

Get a pandas DataFrame of records that fit the search criteria of the experiment. Inputs are the search criteria users want to apply.

Returns:

A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don’t have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.

get_exp(*, experiment_id=None, experiment_name=None, create: bool = True, start: bool = False)

Retrieve an experiment. This method includes getting an active experiment, and get_or_create a specific experiment.

When the user specifies an experiment id or name, the method will try to return that specific experiment. When the user does not provide an experiment id or name, the method will try to return the current active experiment. The create argument determines whether the method will automatically create a new experiment according to the user’s specification if the experiment hasn’t been created before.

  • If create is True:

    • If an active experiment exists:

      • no id or name specified: return the active experiment.

      • id or name specified: return the specified experiment. If no such experiment is found, create a new experiment with the given id or name. If start is set to True, the experiment is set to be active.

    • If no active experiment exists:

      • no id or name specified: create a default experiment.

      • id or name specified: return the specified experiment. If no such experiment is found, create a new experiment with the given id or name. If start is set to True, the experiment is set to be active.

  • Else, if create is False:

    • If an active experiment exists:

      • no id or name specified: return the active experiment.

      • id or name specified: return the specified experiment. If no such experiment is found, raise Error.

    • If no active experiment exists:

      • no id or name specified: if the default experiment exists, return it; otherwise, raise Error.

      • id or name specified: return the specified experiment. If no such experiment is found, raise Error.

Parameters:
  • experiment_id (str) – id of the experiment to return.

  • experiment_name (str) – name of the experiment to return.

  • create (boolean) – create the experiment if it hasn’t been created before.

  • start (boolean) – start the new experiment if one is created.

Return type:

An experiment object.

delete_exp(experiment_id=None, experiment_name=None)

Delete an experiment.

Parameters:
  • experiment_id (str) – the experiment id.

  • experiment_name (str) – the experiment name.

property default_uri

Get the default tracking URI from qlib.config.C

property uri

Get the default tracking URI or current URI.

Return type:

The tracking URI string.

list_experiments()

List all the existing experiments.

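Return type:

A dictionary (name -> experiment) of the stored experiment information.

For other interfaces such as create_exp and delete_exp, please refer to the Experiment Manager API.

Experiment

The Experiment class is responsible for a single experiment only and handles every operation related to that experiment. It includes basic methods such as start and end. In addition, methods related to recorders are provided, such as get_recorder and list_recorders.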

class qlib.workflow.exp.Experiment(id, name)

This is the Experiment class for each experiment being run. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)

__init__(id, name)
start(*, recorder_id=None, recorder_name=None, resume=False)

Start the experiment and set it to be active. This method will also start a new recorder.

Parameters:
  • recorder_id (str) – the id of the recorder to be created.

  • recorder_name (str) – the name of the recorder to be created.

  • resume (bool) – whether to resume the first recorder

Return type:

An active recorder.

end(recorder_status='SCHEDULED')

End the experiment.

Parameters:

recorder_status (str) – the status to set on the recorder when ending (SCHEDULED, RUNNING, FINISHED, FAILED).

create_recorder(recorder_name=None)

Create a recorder for each experiment.

Parameters:

recorder_name (str) – the name of the recorder to be created.

Return type:

A recorder object.

search_records(**kwargs)

Get a pandas DataFrame of records that fit the search criteria of the experiment. Inputs are the search criteria users want to apply.

Returns:

A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don’t have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.

delete_recorder(recorder_id)

Delete a recorder from the experiment.

Parameters:

recorder_id (str) – the id of the recorder to be deleted.

get_recorder(recorder_id=None, recorder_name=None, create: bool = True, start: bool = False) Recorder

Retrieve a Recorder for the user. When the user specifies a recorder id or name, the method will try to return that specific recorder. When the user does not provide a recorder id or name, the method will try to return the current active recorder. The create argument determines whether the method will automatically create a new recorder according to the user’s specification if the recorder hasn’t been created before.

  • If create is True:

    • If an active recorder exists:

      • no id or name specified: return the active recorder.

      • id or name specified: return the specified recorder. If no such recorder is found, create a new recorder with the given id or name. If start is set to True, the recorder is set to be active.

    • If no active recorder exists:

      • no id or name specified: create a new recorder.

      • id or name specified: return the specified recorder. If no such recorder is found, create a new recorder with the given id or name. If start is set to True, the recorder is set to be active.

  • Else, if create is False:

    • If an active recorder exists:

      • no id or name specified: return the active recorder.

      • id or name specified: return the specified recorder. If no such recorder is found, raise Error.

    • If no active recorder exists:

      • no id or name specified: raise Error.

      • id or name specified: return the specified recorder. If no such recorder is found, raise Error.

Parameters:
  • recorder_id (str) – the id of the recorder to be retrieved.

  • recorder_name (str) – the name of the recorder to be retrieved.

  • create (boolean) – create the recorder if it hasn’t been created before.

  • start (boolean) – start the new recorder if one is created.

Return type:

A recorder object.

list_recorders(rtype: Literal['dict', 'list'] = 'dict', **flt_kwargs) List[Recorder] | Dict[str, Recorder]

List all the existing recorders of this experiment. Please first get the experiment instance before calling this method. If users want to use the method R.list_recorders(), please refer to the related API documentation in QlibRecorder.

flt_kwargs (dict)

filter recorders by conditions, e.g. list_recorders(status=Recorder.STATUS_FI)

Returns:

if rtype == “dict”:

A dictionary (id -> recorder) of the stored recorder information.

elif rtype == “list”:

A list of Recorder.

Return type:

The return type depends on rtype
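The interplay of rtype and the keyword filters can be sketched as follows. This is a toy illustration only: list_records and the RECORDS data are hypothetical, and plain dictionaries stand in for Recorder objects.

```python
from typing import Dict, List, Union

# Hypothetical recorder records; real Recorder objects carry more information.
RECORDS = [
    {"id": "r1", "status": "FINISHED"},
    {"id": "r2", "status": "FAILED"},
    {"id": "r3", "status": "FINISHED"},
]


def list_records(rtype: str = "dict", **flt_kwargs) -> Union[Dict[str, dict], List[dict]]:
    """Keep records matching every filter, then shape the result by rtype."""
    selected = [r for r in RECORDS if all(r.get(k) == v for k, v in flt_kwargs.items())]
    if rtype == "dict":
        return {r["id"]: r for r in selected}  # id -> record
    if rtype == "list":
        return selected
    raise ValueError(f"unsupported rtype: {rtype!r}")


as_dict = list_records(status="FINISHED")
as_list = list_records(rtype="list", status="FAILED")
print(sorted(as_dict), as_list)
```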

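For other interfaces such as search_records and delete_recorder, please refer to the Experiment API.

Qlib also provides a default Experiment, which is created and used in certain situations when users call APIs such as log_metrics or get_exp. If the default Experiment is used, related log messages are emitted while Qlib runs. Users can change the name of the default Experiment in Qlib's config file or during Qlib's initialization. The default name is currently ‘Experiment’.

Recorder

The Recorder class is responsible for a single recorder. It handles detailed operations of a single run, such as log_metrics and log_params. It is designed to help users easily track the results and other things generated during a run.

Here are some important APIs that are not included in QlibRecorder: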

class qlib.workflow.recorder.Recorder(experiment_id, name)

This is the Recorder class for logging the experiments. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)

The status of the recorder can be SCHEDULED, RUNNING, FINISHED, FAILED.

__init__(experiment_id, name)
save_objects(local_path=None, artifact_path=None, **kwargs)

Save objects such as prediction files or model checkpoints to the artifact URI. Users can save objects through keyword arguments (name: value).

Please refer to the docs of qlib.workflow:R.save_objects

Parameters:
  • local_path (str) – if provided, then save the file or directory to the artifact URI.

  • artifact_path (str) – the relative path for the artifact to be stored in the URI.

load_object(name)

Load objects such as prediction file or model checkpoints.

Parameters:

name (str) – name of the file to be loaded.

Return type:

The saved object.

start_run()

Start running or resuming the Recorder. The return value can be used as a context manager within a with block; otherwise, you must call end_run() to terminate the current run. (See ActiveRun class in mlflow)

Return type:

An active running object (e.g. mlflow.ActiveRun object).

end_run()

End an active Recorder.

log_params(**kwargs)

Log a batch of params for the current run.

Parameters:

arguments (keyword) – key, value pair to be logged as parameters.

log_metrics(step=None, **kwargs)

Log multiple metrics for the current run.

Parameters:

arguments (keyword) – key, value pair to be logged as metrics.

log_artifact(local_path: str, artifact_path: str | None = None)

Log a local file or directory as an artifact of the currently active run.

Parameters:
  • local_path (str) – Path to the file to write.

  • artifact_path (Optional[str]) – If provided, the directory in artifact_uri to write to.

set_tags(**kwargs)

Log a batch of tags for the current run.

Parameters:

arguments (keyword) – key, value pair to be logged as tags.

delete_tags(*keys)

Delete some tags from a run.

Parameters:

keys (sequence of str) – the names of all the tags to be deleted.

list_artifacts(artifact_path: str | None = None)

List all the artifacts of a recorder.

Parameters:

artifact_path (str) – the relative path for the artifact to be stored in the URI.

Return type:

A list of the stored artifact information (name, path, etc.).

download_artifact(path: str, dst_path: str | None = None) str

Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.

Parameters:
  • path (str) – Relative source path to the desired artifact.

  • dst_path (Optional[str]) – Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will be downloaded to a new uniquely-named directory on the local filesystem.

Returns:

Local path of desired artifact.

Return type:

str

list_metrics()

List all the metrics of a recorder.

Return type:

A dictionary of the stored metrics.

list_params()

List all the params of a recorder.

Return type:

A dictionary of the stored params.

list_tags()

List all the tags of a recorder.

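Return type:

A dictionary of the stored tags.

For other interfaces such as save_objects and load_object, please refer to the Recorder API.

Record Template

The RecordTemp class generates experiment results, such as IC and backtest results, in a certain format. We provide three different Record Template classes:

  • SignalRecord: This class generates the prediction results of the model.

  • SigAnaRecord: This class generates the IC, ICIR, Rank IC, and Rank ICIR of the model.

Here is a simple example of what is done in SigAnaRecord. Users can refer to it to calculate IC, Rank IC, and long-short return with their own predictions and labels.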

from qlib.contrib.eva.alpha import calc_ic, calc_long_short_return

ic, ric = calc_ic(pred.iloc[:, 0], label.iloc[:, 0])
long_short_r, long_avg_r = calc_long_short_return(pred.iloc[:, 0], label.iloc[:, 0])
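  • PortAnaRecord: This class generates the results of backtest. For detailed information about backtest and the available strategies, users can refer to Strategy and Backtest. Here is a simple example of what is done in PortAnaRecord. Users can refer to it to run a backtest based on their own predictions and labels.

import pandas as pd
from qlib.contrib.strategy.strategy import TopkDropoutStrategy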
from qlib.contrib.evaluate import (
    backtest as normal_backtest,
    risk_analysis,
)

# backtest
STRATEGY_CONFIG = {
    "topk": 50,
    "n_drop": 5,
}
BACKTEST_CONFIG = {
    "limit_threshold": 0.095,
    "account": 100000000,
    "benchmark": BENCHMARK,
    "deal_price": "close",
    "open_cost": 0.0005,
    "close_cost": 0.0015,
    "min_cost": 5,
}

strategy = TopkDropoutStrategy(**STRATEGY_CONFIG)
report_normal, positions_normal = normal_backtest(pred_score, strategy=strategy, **BACKTEST_CONFIG)

# analysis
analysis = dict()
analysis["excess_return_without_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"])
analysis["excess_return_with_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"] - report_normal["cost"])
analysis_df = pd.concat(analysis)  # type: pd.DataFrame
print(analysis_df)

For more information about the APIs, please refer to the Record Template API.
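For readers who want to see what the IC and Rank IC reported by SigAnaRecord measure, here is a minimal pure-Python sketch for a single cross-section. It assumes the usual definitions (IC as the Pearson correlation between prediction and label, Rank IC as the same correlation computed on ranks), ignores ties, and uses made-up numbers. The helpers pearson and ranks are hypothetical and are not the functions in qlib.contrib.eva.alpha.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values: Sequence[float]) -> List[int]:
    """1-based ranks in ascending order; this sketch assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


# Made-up predictions and realized labels for four instruments on one day.
pred = [0.9, 0.1, 0.5, 0.7]
label = [0.03, -0.02, 0.04, 0.01]

ic = pearson(pred, label)  # IC: correlation of raw values
rank_ic = pearson(ranks(pred), ranks(label))  # Rank IC: correlation of ranks
print(ic, rank_ic)
```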

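Known Limitations

  • Python objects are saved with pickle, which may cause problems when the environment that dumps an object differs from the environment that loads it.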
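One way to reduce (not eliminate) this risk is to pin the pickle protocol when objects are dumped outside Qlib, so that older Python versions can still read them. The sketch below only illustrates the protocol aspect with made-up data. Mismatched library versions for custom classes are a separate problem that protocol pinning does not solve, and whether Qlib lets users choose the protocol is not stated in this document.

```python
import pickle

# Illustrative payload; the field names and values are made up.
pred = {"instrument": "SH600000", "score": 0.42}

# Protocol 4 is readable by Python 3.4 and later, whereas
# pickle.HIGHEST_PROTOCOL depends on the interpreter that dumps the object.
PINNED_PROTOCOL = 4

payload = pickle.dumps(pred, protocol=PINNED_PROTOCOL)
restored = pickle.loads(payload)

# For protocol 2 and later, the stream starts with the PROTO opcode (0x80)
# followed by the protocol number.
header = payload[:2]
print(header, restored)
```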