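Description
Currently, libraries across the ecosystem provide various APIs for seeding pseudo-random number generation. This SPEC suggests a single, pragmatic API, taking into account technical and historical factors. Adopting such a uniform API will simplify the user experience, especially for users who rely on multiple projects.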
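We recommend:
- standardizing the usage and interpretation of an `rng` keyword for seeding, and
- avoiding the use of global state and legacy bitstream generators.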
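We suggest implementing these principles by:
- deprecating uses of an existing seed argument (commonly `random_state` or `seed`) in favor of a consistent `rng` argument,
- using `numpy.random.default_rng` to normalize the `rng` argument and instantiate a `Generator` [1], and
- deprecating the use of `numpy.random.seed` to control the random state.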
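We are primarily concerned with API uniformity, but we also encourage libraries to move toward NumPy pseudo-random `Generator`s because:
- `Generator`s avoid the problems associated with naive seeding (e.g., using successive integers) through the `SeedSequence` mechanism; and
- their use avoids reliance on global state, which can make code execution harder to track and may cause problems in parallel processing scenarios.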
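As a rough illustration of the `SeedSequence` point, the sketch below derives several independent streams from one root seed instead of hand-picking consecutive integers. The root seed and the number of workers are arbitrary values chosen for this example.

```python
import numpy as np

# One root seed for the whole computation (arbitrary example value).
root = np.random.SeedSequence(12345)

# Spawn one child seed per worker instead of using seeds 0, 1, 2, ...
child_seeds = root.spawn(3)
worker_rngs = [np.random.default_rng(child) for child in child_seeds]

# Each worker draws from its own stream; no global state is involved.
draws = [rng.random(2) for rng in worker_rngs]
```

Integer seeds passed to `numpy.random.default_rng` are also routed through a `SeedSequence`, so spawning is mainly useful when many related streams are needed, for example one per parallel worker.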
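Scope
This is intended as a recommendation to all libraries that allow users to control the state of a NumPy random number generator. It is specifically targeted at functions that currently accept `RandomState` instances via an argument other than `rng`, or that allow `numpy.random.seed` to control the random state, but the ideas are more broadly applicable. The `rng` keyword could also accommodate random number generators other than those provided by NumPy, but that is beyond the scope of this SPEC.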
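Concepts
- `BitGenerator`: generates a stream of pseudo-random bits. The default generator in NumPy (`numpy.random.default_rng`) uses PCG64.
- `Generator`: derives pseudo-random numbers from the bits produced by a `BitGenerator`.
- `RandomState`: a legacy object in NumPy, similar to `Generator`, that produces random numbers based on the Mersenne Twister algorithm.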
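The relationship between these three objects can be sketched in a few lines. The seed value is an arbitrary choice for this example.

```python
import numpy as np

seed = 2024  # arbitrary seed for the example

# default_rng wraps a PCG64 BitGenerator in a Generator.
rng = np.random.default_rng(seed)
uses_pcg64 = isinstance(rng.bit_generator, np.random.PCG64)

# The same pairing can be built explicitly: BitGenerator first, then Generator.
explicit = np.random.Generator(np.random.PCG64(seed))
same_stream = np.array_equal(rng.random(3), explicit.random(3))

# The legacy RandomState draws from a Mersenne Twister stream instead.
legacy = np.random.RandomState(seed)
legacy_sample = legacy.random_sample(3)
```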
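Constraints
NumPy, SciPy, scikit-learn, scikit-image, and NetworkX all implement pseudo-random seeding in slightly different ways. Common keyword arguments include `random_state` and `seed`. In practice, the seed can often also be controlled using `numpy.random.seed`.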
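To make the difference between the two styles concrete, here is a small sketch contrasting a function that depends on the global state with one that takes an explicit `rng`. Both function names are hypothetical and exist only for this example.

```python
import numpy as np


def noisy_global(n):
    # Legacy style: draws from NumPy's hidden global RandomState.
    return np.random.random(n)


def noisy_local(n, *, rng=None):
    # Style recommended by this SPEC: the state is passed in explicitly.
    rng = np.random.default_rng(rng)
    return rng.random(n)


np.random.seed(0)  # affects every caller of the global state
a = noisy_global(2)
np.random.seed(0)
b = noisy_global(2)  # equals `a` only because the global state was reset

c = noisy_local(2, rng=0)  # reproducible without touching global state
d = noisy_local(2, rng=0)
```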
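Core Project Endorsement
Endorsement of this SPEC means that a project considers the standardization and interpretation of the `rng` keyword, and the avoidance of global state and legacy bitstream generators, to be good ideas that are worth implementing widely.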
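Ecosystem Adoption
To adopt this SPEC, a project should:
- deprecate the use of `random_state`/`seed` arguments in favor of an `rng` argument in all functions where users need to control pseudo-random number generation,
- use `numpy.random.default_rng` to normalize the `rng` argument and instantiate a `Generator`, and
- deprecate the use of `numpy.random.seed` to control the random state.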
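The normalization step relies on how `numpy.random.default_rng` treats different inputs. The following sketch shows the main cases; the seed value 7 is arbitrary.

```python
import numpy as np

# A Generator is returned unaltered, so its state is shared with the caller.
gen = np.random.default_rng(7)
same_object = np.random.default_rng(gen) is gen

# An integer seed produces a new, reproducible Generator each time.
x = np.random.default_rng(7).integers(0, 100, size=3)
y = np.random.default_rng(7).integers(0, 100, size=3)

# A BitGenerator is wrapped in a new Generator.
wrapped = np.random.default_rng(np.random.PCG64(7))

# None draws fresh entropy from the operating system (not reproducible).
unseeded = np.random.default_rng(None)
```

Because a `Generator` passes through unchanged, a caller can hand the same object to several functions and have them all advance one shared stream.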
Badges
Projects can highlight their adoption of this SPEC by including a SPEC badge.
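[![SPEC 7 — Seeding pseudo-random number generation](https://img.shields.io/badge/SPEC-7-green?labelColor=%23004811&color=%235CA038)](https://scientific-python.cn/specs/spec-0007/)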
|SPEC 7 — Seeding pseudo-random number generation|

.. |SPEC 7 — Seeding pseudo-random number generation| image:: https://img.shields.io/badge/SPEC-7-green?labelColor=%23004811&color=%235CA038
   :target: https://scientific-python.cn/specs/spec-0007/
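Implementation
Legacy behavior in packages such as scikit-learn (`sklearn.utils.check_random_state`) typically handles `None` (use the global seed state), an integer (convert to a `RandomState`), or `RandomState` objects.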
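Our recommendation here is a deprecation strategy that does not adhere to the Hinsen principle [2] in all cases, although it could very nearly do so by enforcing the use of `rng` as a keyword argument.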
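The deprecation strategy is as follows.
Initially, accept both `rng` and the existing `random_state`/`seed`/`...` keyword arguments.
- If both are specified by the user, raise an error.
- If `rng` is passed by keyword, normalize it using `np.random.default_rng()` and use it to generate random numbers as needed.
- If `random_state`/`seed`/`...` is specified (by keyword or by position, if allowed), preserve the existing behavior.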
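After `rng` becomes available in all releases within the support window suggested by SPEC 0, emit warnings as follows:
- If neither `rng` nor `random_state`/`seed`/`...` is specified and `np.random.seed` has been used to set the seed, emit a `FutureWarning` about the upcoming change in runtime behavior.
- If `random_state`/`seed`/`...` is passed by keyword or by position, treat it as before, but:
  - if passed by keyword, emit a `DeprecationWarning` stating that the `random_state` keyword is deprecated in favor of `rng`;
  - if passed by position, emit a `FutureWarning` stating that the runtime behavior of the positional argument will change.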
After the deprecation period, accept only `rng`, and raise an error if `random_state`/`seed`/`...` is provided.
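For projects that prefer not to use a decorator, the warning phase of this strategy could be written by hand roughly as follows. This is a simplified sketch under stated assumptions: the function `shuffle_rows` is hypothetical, the old keyword is assumed to accept only integer seeds, and the positional-argument and `np.random.seed` cases described above are omitted.

```python
import warnings

import numpy as np


def shuffle_rows(data, *, rng=None, random_state=None):
    """Hypothetical function in the warning phase of the transition."""
    if rng is not None and random_state is not None:
        raise TypeError("Specify only one of `rng` and `random_state`.")
    if random_state is not None:
        warnings.warn(
            "`random_state` is deprecated; use `rng` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Preserve the existing behavior for the old keyword.
        legacy = np.random.RandomState(random_state)
        return data[legacy.permutation(len(data))]
    # New behavior: normalize `rng` with default_rng.
    rng = np.random.default_rng(rng)
    return rng.permutation(data, axis=0)


data = np.arange(10).reshape(5, 2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = shuffle_rows(data, random_state=0)  # warns, legacy stream
    new = shuffle_rows(data, rng=0)  # silent, Generator stream

warning_categories = [w.category for w in caught]
```

Recording warnings with `warnings.catch_warnings(record=True)` is one way a test suite might check that only the old keyword triggers a `DeprecationWarning`.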
At that point, a function signature with type annotations might look like this:
from collections.abc import Sequence

import numpy as np

SeedLike = int | np.integer | Sequence[int] | np.random.SeedSequence
RNGLike = np.random.Generator | np.random.BitGenerator


def my_func(*, rng: RNGLike | SeedLike | None = None):
    """My function summary.

    Parameters
    ----------
    rng : `numpy.random.Generator`, optional
        Pseudorandom number generator state. When `rng` is None, a new
        `numpy.random.Generator` is created using entropy from the
        operating system. Types other than `numpy.random.Generator` are
        passed to `numpy.random.default_rng` to instantiate a `Generator`.
    """
    rng = np.random.default_rng(rng)

    ...
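Also note the suggested language for the `rng` parameter docstring, which encourages users to pass a `Generator` or `None` but allows the other types accepted by `numpy.random.default_rng` (captured by the type annotation).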
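From the caller's side, a keyword-only `rng` signature behaves as sketched below. The function `sample_mean` and the seed 99 are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_mean(n, *, rng=None):
    """Hypothetical function using the keyword-only `rng` signature."""
    rng = np.random.default_rng(rng)
    return rng.standard_normal(n).mean()


# Passing a shared Generator: successive calls advance the same stream.
shared = np.random.default_rng(99)
first = sample_mean(100, rng=shared)
second = sample_mean(100, rng=shared)

# Passing the same integer seed twice reproduces the same result.
repeat_a = sample_mean(100, rng=99)
repeat_b = sample_mean(100, rng=99)

# Positional use is rejected by the keyword-only signature.
try:
    sample_mean(100, 99)
    positional_allowed = True
except TypeError:
    positional_allowed = False
```

Making `rng` keyword-only means there is no positional argument whose meaning could silently change later, which is the property the Hinsen principle asks for.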
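Impact
There are three classes of users, who will be affected to varying degrees.
- Those who do not attempt to control the random state. Their code will switch from using the unseeded global `RandomState` to using an unseeded `Generator`. Since the underlying distributions of pseudo-random numbers will not change, these users should be largely unaffected. While this change technically does not adhere to the Hinsen principle, its impact should be minimal.
- Users of the `random_state`/`seed` arguments. Support for these arguments will eventually be dropped, but during the deprecation period we can provide clear guidance, via warnings and documentation, on how to migrate to the new `rng` keyword.
- Those who use `numpy.random.seed`. The proposal will do away with that global seeding mechanism, meaning that code relying on it will, after the deprecation period, go from being seeded to being unseeded. To ensure that this does not go unnoticed, libraries that allow control of the random state via `numpy.random.seed` should raise a `FutureWarning` if `np.random.seed` has been called. (See the Code section below for an example.) To fully adhere to the Hinsen principle, these warnings should instead be raised as errors. In response, users will have to switch from using `numpy.random.seed` to explicitly passing the `rng` argument to all functions that accept it.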
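For the third group of users, the migration could look roughly like this. The functions `simulate`, `resample`, and `pipeline` are hypothetical stand-ins for library and user code that follow the `rng` convention.

```python
import numpy as np


def simulate(n, *, rng=None):
    # Hypothetical library function that already accepts `rng`.
    rng = np.random.default_rng(rng)
    return rng.normal(size=n)


def resample(values, *, rng=None):
    # Hypothetical second function sharing the same convention.
    rng = np.random.default_rng(rng)
    return rng.choice(values, size=len(values), replace=True)


def pipeline(seed):
    # Before: np.random.seed(seed) at the top of the script.
    # After: one Generator is created and passed explicitly to every call.
    rng = np.random.default_rng(seed)
    values = simulate(5, rng=rng)
    return resample(values, rng=rng)


run_1 = pipeline(2025)
run_2 = pipeline(2025)
```

Creating a single `Generator` and passing it along keeps the whole pipeline reproducible from one seed, without any function depending on hidden global state.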
Code
As an example, consider how a SciPy function could transition from a `random_state` parameter to an `rng` parameter using a decorator.
import numpy as np
import functools
import warnings


def _transition_to_rng(old_name, *, position_num=None, end_version=None):
    """Example decorator to transition from old PRNG usage to new `rng` behavior

    Suppose the decorator is applied to a function that used to accept parameter
    `old_name='random_state'` either by keyword or as a positional argument at
    `position_num=1`. At the time of application, the name of the argument in the
    function signature is manually changed to the new name, `rng`. If positional
    use was allowed before, this is not changed.*

    - If the function is called with both `random_state` and `rng`, the decorator
      raises an error.
    - If `random_state` is provided as a keyword argument, the decorator passes
      `random_state` to the function's `rng` argument as a keyword. If `end_version`
      is specified, the decorator will emit a `DeprecationWarning` about the
      deprecation of keyword `random_state`.
    - If `random_state` is provided as a positional argument, the decorator passes
      `random_state` to the function's `rng` argument by position. If `end_version`
      is specified, the decorator will emit a `FutureWarning` about the changing
      interpretation of the argument.
    - If `rng` is provided as a keyword argument, the decorator validates `rng` using
      `numpy.random.default_rng` before passing it to the function.
    - If `end_version` is specified and neither `random_state` nor `rng` is provided
      by the user, the decorator checks whether `np.random.seed` has been used to set
      the global seed. If so, it emits a `FutureWarning`, noting that usage of
      `numpy.random.seed` will eventually have no effect. Either way, the decorator
      calls the function without explicitly passing the `rng` argument.

    If `end_version` is specified, a user must pass `rng` as a keyword to avoid warnings.

    After the deprecation period, the decorator can be removed, and the function
    can simply validate the `rng` argument by calling `np.random.default_rng(rng)`.

    * A `FutureWarning` is emitted when the PRNG argument is used by
      position. It indicates that the "Hinsen principle" (same
      code yielding different results in two versions of the software)
      will be violated, unless positional use is deprecated. Specifically:

      - If `None` is passed by position and `np.random.seed` has been used,
        the function will change from being seeded to being unseeded.
      - If an integer is passed by position, the random stream will change.
      - If `np.random` or an instance of `RandomState` is passed by position,
        an error will be raised.

      We suggest that projects consider deprecating positional use of
      `random_state`/`rng` (i.e., change their function signatures to
      ``def my_func(..., *, rng=None)``); that might not make sense
      for all projects, so this SPEC does not make that
      recommendation, neither does this decorator enforce it.

    Parameters
    ----------
    old_name : str
        The old name of the PRNG argument (e.g. `seed` or `random_state`).
    position_num : int, optional
        The (0-indexed) position of the old PRNG argument (if accepted by position).
        Maintainers are welcome to eliminate this argument and use, for example,
        `inspect`, if preferred.
    end_version : str, optional
        The full version number of the library when the behavior described in
        `DeprecationWarning`s and `FutureWarning`s will take effect. If left
        unspecified, no warnings will be emitted by the decorator.

    """
    NEW_NAME = "rng"

    cmn_msg = (
        "To silence this warning and ensure consistent behavior in SciPy "
        f"{end_version}, control the RNG using argument `{NEW_NAME}`. Arguments passed "
        f"to keyword `{NEW_NAME}` will be validated by `np.random.default_rng`, so the "
        "behavior corresponding with a given value may change compared to use of "
        f"`{old_name}`. For example, "
        "1) `None` will result in unpredictable random numbers, "
        "2) an integer will result in a different stream of random numbers, (with the "
        "same distribution), and "
        "3) `np.random` or `RandomState` instances will result in an error. "
        "See the documentation of `default_rng` for more information."
    )

    def decorator(fun):
        @functools.wraps(fun)
        def wrapper(*args, **kwargs):
            # Determine how PRNG was passed
            as_old_kwarg = old_name in kwargs
            as_new_kwarg = NEW_NAME in kwargs
            as_pos_arg = position_num is not None and len(args) >= position_num + 1
            emit_warning = end_version is not None

            # Can only specify PRNG one of the three ways
            if int(as_old_kwarg) + int(as_new_kwarg) + int(as_pos_arg) > 1:
                message = (
                    f"{fun.__name__}() got multiple values for "
                    f"argument now known as `{NEW_NAME}`"
                )
                raise TypeError(message)

            # Check whether global random state has been set
            # (this relies on private NumPy attributes, which may change)
            global_seed_set = np.random.mtrand._rand._bit_generator._seed_seq is None

            if as_old_kwarg:  # warn about deprecated use of old kwarg
                kwargs[NEW_NAME] = kwargs.pop(old_name)
                if emit_warning:
                    message = (
                        f"Use of keyword argument `{old_name}` is "
                        f"deprecated and replaced by `{NEW_NAME}`. "
                        f"Support for `{old_name}` will be removed "
                        f"in SciPy {end_version}. "
                    ) + cmn_msg
                    warnings.warn(message, DeprecationWarning, stacklevel=2)

            elif as_pos_arg:
                # Warn about changing meaning of positional arg

                # Note that this decorator does not deprecate positional use of the
                # argument; it only warns that the behavior will change in the future.
                # Simultaneously transitioning to keyword-only use is another option.

                arg = args[position_num]
                # If the argument is None and the global seed wasn't set, or if the
                # argument is one of a few new classes, the user will not notice change
                # in behavior.
                ok_classes = (
                    np.random.Generator,
                    np.random.SeedSequence,
                    np.random.BitGenerator,
                )
                if (arg is None and not global_seed_set) or isinstance(arg, ok_classes):
                    pass
                elif emit_warning:
                    message = (
                        f"Positional use of `{NEW_NAME}` (formerly known as "
                        f"`{old_name}`) is still allowed, but the behavior is "
                        "changing: the argument will be normalized using "
                        f"`np.random.default_rng` beginning in SciPy {end_version}, "
                        "and the resulting `Generator` will be used to generate "
                        "random numbers. "
                    ) + cmn_msg
                    warnings.warn(message, FutureWarning, stacklevel=2)

            elif as_new_kwarg:  # no warnings; this is the preferred use
                # After the removal of the decorator, normalization with
                # np.random.default_rng will be done inside the decorated function
                kwargs[NEW_NAME] = np.random.default_rng(kwargs[NEW_NAME])

            elif global_seed_set and emit_warning:
                # Emit FutureWarning if `np.random.seed` was used and no PRNG was passed
                message = (
                    "The NumPy global RNG was seeded by calling "
                    f"`np.random.seed`. Beginning in {end_version}, this "
                    "function will no longer use the global RNG. "
                ) + cmn_msg
                warnings.warn(message, FutureWarning, stacklevel=2)

            return fun(*args, **kwargs)

        return wrapper

    return decorator
# Example usage of the _transition_to_rng decorator.

# Suppose a library uses a custom random state normalisation function, such as
from scipy._lib._util import check_random_state
# https://github.com/scipy/scipy/blob/94532e74b902b569bfad504866cb53720c5f4f31/scipy/_lib/_util.py#L253


# Suppose a function `library_function` is defined as:
def library_function(arg1, random_state=None, arg2=0):
    random_state = check_random_state(random_state)
    return random_state.random() * arg1 + arg2


# We apply the decorator and change the function signature at the same time.
# The use of `random_state` throughout the function may be replaced with `rng`,
# or the variable may be defined as `random_state = rng`.
@_transition_to_rng("random_state", position_num=1)
def library_function(arg1, rng=None, arg2=0):
    rng = check_random_state(rng)
    return rng.random() * arg1 + arg2


# After `rng` is available in all releases within the support window suggested by
# SPEC 0, we pass the `end_version` param to the decorator to emit warnings.
@_transition_to_rng("random_state", position_num=1, end_version="1.17.0")
def library_function(arg1, rng=None, arg2=0):
    rng = check_random_state(rng)
    return rng.random() * arg1 + arg2


# At the end of the deprecation period, remove the decorator, and normalize
# `rng` with `np.random.default_rng`.
def library_function(arg1, rng=None, arg2=0):
    rng = np.random.default_rng(rng)
    return rng.random() * arg1 + arg2