Stats

core.py

Core functionality of the stats module.

`bootstrap_metric(returns: np.ndarray, metric: str | Callable = 'sharpe_ratio', n_bootstraps: int = 1000, n_jobs: int = 2, min_length: int = 5, rng: None | Generator = None, **kwargs: dict) -> np.ndarray`

Compute the given metric on bootstrap resamples of the returns.

Parameters:

returns (ndarray) –

a vector-like object of returns
metric (str | Callable, default: 'sharpe_ratio' ) –

input metric: either the name of a metric function (without the 'compute_' prefix) defined in the 'stats' module, or a callable that accepts returns as its first argument.
n_bootstraps (int, default: 1000 ) –

number of bootstrap samples
n_jobs (int, default: 2 ) –

number of parallel jobs in the computation
min_length (int, default: 5 ) –

minimum size of bootstrap sample
rng (None | Generator, default: None ) –

numpy random Generator
kwargs (dict, default: {} ) –

additional arguments passed to the metric function

Returns:	`ndarray` – an array containing the bootstrapped results

Example:

>>> import numpy as np
>>> returns = np.random.default_rng().normal(loc=0, scale=0.01, size=100)
>>> results = bootstrap_metric(returns, metric="sharpe_ratio", n_bootstraps=100)
>>> print(results.shape)
(100,)

Using a custom metric function:
>>> def mean_return(x):
...     return np.mean(x)
>>> results = bootstrap_metric(returns, metric=mean_return, n_bootstraps=100)
>>> np.mean(results)
0.0005  # (example output)
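
To make the resampling idea concrete, here is a minimal sketch of a bootstrap loop. It is not the library's implementation: `bootstrap_metric_sketch` and `sharpe_ratio` are hypothetical names, the resampling is plain i.i.d. sampling with replacement, and the `n_jobs` and `min_length` options documented above are ignored.

```python
import numpy as np

def sharpe_ratio(x):
    # Illustrative per-period Sharpe ratio with a zero risk-free rate.
    return np.mean(x) / np.std(x, ddof=1)

def bootstrap_metric_sketch(returns, metric, n_bootstraps=1000, rng=None, **kwargs):
    # Hypothetical helper: evaluate `metric` on i.i.d. resamples of `returns`.
    rng = np.random.default_rng() if rng is None else rng
    returns = np.asarray(returns, dtype=float)
    n = returns.size
    results = np.empty(n_bootstraps)
    for i in range(n_bootstraps):
        # Draw n indices with replacement and score the resampled vector.
        sample = returns[rng.integers(0, n, size=n)]
        results[i] = metric(sample, **kwargs)
    return results

rng = np.random.default_rng(42)
returns = rng.normal(loc=0.0, scale=0.01, size=100)
results = bootstrap_metric_sketch(returns, sharpe_ratio, n_bootstraps=200, rng=rng)

# A 95% percentile interval summarises the spread of the bootstrapped metric.
low, high = np.percentile(results, [2.5, 97.5])
```

Passing a seeded `Generator` makes the resamples reproducible, which is presumably the purpose of the `rng` parameter above.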

`compute_pct_returns(x: pd.Series) -> float`

Compute the percentage return of a series

Parameters:	`x` (`Series`) – Input pandas Series representing prices or values over time

Returns:	`float` – The percentage return computed as (last / first) - 1, or NaN if the first value is zero

Example:

>>> import pandas as pd
>>> s = pd.Series([100, 110])
>>> compute_pct_returns(s)
0.1  # (approximately)

>>> s_zero = pd.Series([0, 110])
>>> compute_pct_returns(s_zero)
nan
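
The documented formula and its zero guard can be sketched as follows. `pct_return_sketch` is a hypothetical stand-in written from the description above, not the library's source.

```python
import numpy as np
import pandas as pd

def pct_return_sketch(x: pd.Series) -> float:
    # Hypothetical re-implementation of the documented behaviour.
    first, last = x.iloc[0], x.iloc[-1]
    if first == 0:
        # Division by zero is undefined, so report NaN as documented.
        return float("nan")
    return float(last / first - 1)

gain = pct_return_sketch(pd.Series([100, 110]))
undefined = pct_return_sketch(pd.Series([0, 110]))
```

The result is a fraction rather than a percentage figure: a move from 100 to 110 gives roughly 0.1, not 10.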

`compute_returns(x: pd.Series) -> float`

Compute the absolute return of a series

Parameters:	`x` (`Series`) – Input pandas Series representing prices or values over time

Returns:	`float` – The absolute return computed as last value minus first value

Example:

>>> import pandas as pd
>>> s = pd.Series([100, 105, 110])
>>> compute_returns(s)
10.0
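
As a small sketch of the formula, assuming nothing beyond the description above, the absolute return is the last value minus the first, and dividing it by the first value recovers the percentage return. `absolute_return_sketch` is a hypothetical name.

```python
import pandas as pd

def absolute_return_sketch(x: pd.Series) -> float:
    # Last value minus first value; intermediate values are ignored.
    return float(x.iloc[-1] - x.iloc[0])

s = pd.Series([100, 105, 110])
abs_ret = absolute_return_sketch(s)  # 110 - 100
# Dividing by the first value gives (last / first) - 1.
pct_ret = abs_ret / s.iloc[0]
```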

`compute_robust_distance(corr: pd.DataFrame) -> pd.DataFrame`

Compute a robust distance metric from a correlation matrix

Parameters:	`corr` (`DataFrame`) – input correlation matrix

Returns:	`DataFrame` – robust distance

Example:

>>> import pandas as pd
>>> import numpy as np
>>> corr = pd.DataFrame([[1, 0.5], [0.5, 1]], columns=["A", "B"], index=["A", "B"])
>>> dist = compute_robust_distance(corr)
>>> dist.loc["A", "B"]
0.7071067811865476
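
The documentation does not state the formula. One construction that reproduces the value in the example is the two-step distance used in hierarchical risk parity: first map correlations to d = sqrt(0.5 * (1 - rho)), then take the Euclidean distance between columns of that matrix. The sketch below assumes this construction. `robust_distance_sketch` is a hypothetical name, and other definitions could yield the same number for this 2x2 case.

```python
import numpy as np
import pandas as pd

def robust_distance_sketch(corr: pd.DataFrame) -> pd.DataFrame:
    # Assumed formula, not confirmed by the documentation.
    rho = corr.to_numpy(dtype=float)
    # Step 1: correlation distance; clip guards against tiny negative values.
    d = np.sqrt(np.clip(0.5 * (1.0 - rho), 0.0, None))
    # Step 2: Euclidean distance between every pair of columns of d.
    # diff[k, i, j] = d[k, i] - d[k, j]
    diff = d[:, :, None] - d[:, None, :]
    robust = np.sqrt((diff ** 2).sum(axis=0))
    return pd.DataFrame(robust, index=corr.index, columns=corr.columns)

corr = pd.DataFrame([[1, 0.5], [0.5, 1]], columns=["A", "B"], index=["A", "B"])
dist = robust_distance_sketch(corr)
```

Comparing whole columns means two assets are close only if they relate to every other asset in a similar way, which is one sense in which such a distance is more robust than a pairwise one.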

`estimate_correlation(returns: pd.DataFrame, method: str = 'empyrical', rolling_window: int = 5, n_bootstraps: int = 100, n_jobs: int = 2, min_length: int = 5, rng: None | Generator = None) -> pd.DataFrame`

Estimate a correlation matrix using a rolling window and bootstrap procedure.

Parameters:

returns (DataFrame) –

DataFrame of returns with assets as columns
method (str, default: 'empyrical' ) –

estimation method; one of 'empyrical', 'glassocv', or 'ledoit_wolf'
rolling_window (int, default: 5 ) –

Window size for rolling returns computation.
n_bootstraps (int, default: 100 ) –

Number of bootstrap samples.
n_jobs (int, default: 2 ) –

Number of parallel jobs.
min_length (int, default: 5 ) –

Minimum block length for bootstrap.
rng (None | Generator, default: None ) –

Random generator for reproducibility.

Returns:	`DataFrame` – DataFrame containing the estimated correlation matrix

Example:

>>> import pandas as pd
>>> import numpy as np
>>> returns = pd.DataFrame(np.random.normal(0, 0.01, (100, 3)),
...                        columns=["A", "B", "C"])
>>> corr = estimate_correlation(returns, method="ledoit_wolf", n_bootstraps=10)
>>> corr.shape
(3, 3)
>>> corr.columns.tolist()
['A', 'B', 'C']
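
The combination of rolling returns and bootstrapping can be sketched as below. This is a simplification under stated assumptions, not the library's method: `estimate_correlation_sketch` is a hypothetical name, rolling returns are taken as rolling sums, each resample uses the plain sample correlation, rows are resampled i.i.d. (the `min_length` parameter above suggests the library resamples blocks instead), and the bootstrap estimates are simply averaged.

```python
import numpy as np
import pandas as pd

def estimate_correlation_sketch(returns, rolling_window=5, n_bootstraps=100, rng=None):
    # Hypothetical helper illustrating the rolling-window plus bootstrap idea.
    rng = np.random.default_rng() if rng is None else rng
    # Aggregate returns over the rolling window and drop the warm-up rows.
    rolled = returns.rolling(rolling_window).sum().dropna()
    values = rolled.to_numpy()
    n_rows, n_assets = values.shape
    acc = np.zeros((n_assets, n_assets))
    for _ in range(n_bootstraps):
        # Resample rows with replacement and accumulate the sample correlation.
        idx = rng.integers(0, n_rows, size=n_rows)
        acc += np.corrcoef(values[idx], rowvar=False)
    mean_corr = acc / n_bootstraps
    return pd.DataFrame(mean_corr, index=returns.columns, columns=returns.columns)

rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.normal(0, 0.01, (100, 3)), columns=["A", "B", "C"])
corr = estimate_correlation_sketch(returns, rolling_window=5, n_bootstraps=20, rng=rng)
```

Overlapping rolling sums are autocorrelated, so i.i.d. row resampling is a crude choice here; a block bootstrap would preserve that dependence better.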

`get_scorecard(portfolio: pd.DataFrame, freq: str = 'Y') -> pd.DataFrame`

Generate a performance scorecard of portfolio metrics aggregated by period.

Parameters:

portfolio (DataFrame) –

DataFrame containing at least one of the 'returns' or 'pnl' columns. If one is missing, it is computed internally.
freq (str, default: 'Y' ) –

Resampling frequency: 'Y' (year), 'Q' (quarter), or 'M' (month).

Returns:	`DataFrame` – DataFrame with metrics such as Sharpe Ratio, Sortino Ratio, Max Drawdown, VaR, CVaR, and Final P&L for each period, plus a total summary.

Example:

>>> import pandas as pd
>>> import numpy as np
>>> dates = pd.date_range("2020-01-01", periods=100, freq="D")
>>> pnl = np.cumsum(np.random.normal(0, 1, size=100))
>>> df = pd.DataFrame({"pnl": pnl}, index=dates)
>>> scorecard = get_scorecard(df, freq="M")
>>> print(scorecard)  # (illustrative output; values are random and only two periods are shown)
Period         2020-M1   2020-M2  Total
Sharpe-Ratio    0.10      0.12     0.11
Sortino-Ratio   0.15      0.18     0.16
MaxDD          -0.25     -0.30    -0.28
VaR            -0.05     -0.04    -0.045
CVaR           -0.07     -0.06    -0.065
FinalP&L       12.34     14.56    26.90
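
The per-period aggregation can be sketched with a pandas group-by on calendar periods. The code below rests on assumptions: `scorecard_sketch` and `period_metrics` are hypothetical names, the 'pnl' input is treated as cumulative P&L, the Sharpe ratio is left unannualised, only three of the listed metrics are computed, and the period labels follow pandas (`2020-01`) rather than the format shown above.

```python
import numpy as np
import pandas as pd

def period_metrics(daily: pd.Series) -> pd.Series:
    # Metrics for one block of per-step P&L increments.
    cumulative = daily.cumsum()
    drawdown = cumulative - cumulative.cummax()
    return pd.Series({
        "Sharpe-Ratio": daily.mean() / daily.std(ddof=1),
        "MaxDD": drawdown.min(),
        "FinalP&L": daily.sum(),
    })

def scorecard_sketch(pnl: pd.Series, freq: str = "M") -> pd.DataFrame:
    # Convert cumulative P&L into per-step increments (first step = first value).
    daily = pnl.diff().fillna(pnl.iloc[0])
    # Label each observation with its calendar period, then score each group.
    periods = daily.index.to_period(freq)
    columns = {str(p): period_metrics(chunk) for p, chunk in daily.groupby(periods)}
    columns["Total"] = period_metrics(daily)
    return pd.DataFrame(columns)

rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=100, freq="D")
pnl = pd.Series(np.cumsum(rng.normal(0, 1, size=100)), index=dates)
scorecard = scorecard_sketch(pnl, freq="M")
```

With 100 daily observations starting on 2020-01-01, this produces four monthly columns (January to April) plus `Total`, and the per-period final P&L values add up to the total.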