Module panama.ml.benchmark.forecasting
Classes
class BenchmarkCarryOnForecaster (value_col: str, freq: str, timestamp_col: str)
# Imports required by this class.
from abc import abstractmethod
from typing import List, Optional

import pandas as pd
from mlflow.pyfunc import PythonModel


class BenchmarkCarryOnForecaster(PythonModel):
    """An abstract base class for benchmark carry-on forecasters.

    Attributes:
        freq: A string representing the frequency of the time series.
        timestamp_col: A string representing the name of the timestamp column in the time series.
        value_col: A string representing the name of the column containing the target values in the time series.
    """

    def __init__(self, value_col: str, freq: str, timestamp_col: str):
        """Initializes the abstract base class with the specified parameters.

        Args:
            value_col: A string representing the name of the column containing the target values in the time series.
            freq: A string representing the frequency of the time series.
            timestamp_col: A string representing the name of the timestamp column in the time series.
        """
        # The frequency alias is stored in upper case, e.g. "h" becomes "H".
        self.freq = freq.upper()
        self.timestamp_col = timestamp_col
        self.value_col = value_col

    def _extract_date_hour(self, ts: pd.DataFrame, timestamp_col: Optional[str] = None) -> pd.DataFrame:
        """Adds "date" and "hour" columns derived from a timestamp column.

        Args:
            ts: A pandas dataframe containing the timestamp column. It is modified in place.
            timestamp_col: The name of the timestamp column. Defaults to `self.timestamp_col`.

        Returns:
            The same dataframe with the added "date" and "hour" columns.
        """
        timestamp_col = timestamp_col if timestamp_col else self.timestamp_col
        ts["date"] = pd.to_datetime(ts[timestamp_col]).dt.date
        ts["hour"] = pd.to_datetime(ts[timestamp_col]).dt.hour
        return ts

    @abstractmethod
    def fit(self, ts: pd.DataFrame):
        """Fits the forecaster to the specified time series data.

        Args:
            ts: A pandas dataframe containing the time series data.

        Raises:
            NotImplementedError: Subclass must implement fit method.
        """
        raise NotImplementedError("Subclass must implement fit method.")

    @abstractmethod
    def predict(self, context, model_input: List[str], params=None):
        """Predicts future values of the time series.

        Args:
            context: The MLflow model context passed by pyfunc.
            model_input: A list of timestamp strings. The forecasters in this module use the
                first element as the start and the last element as the end of the forecast period.
            params: Optional additional parameters.

        Returns:
            A pandas dataframe containing the predicted values of the time series.

        Raises:
            NotImplementedError: Subclass must implement predict method.
        """
        raise NotImplementedError("Subclass must implement predict method.")
An abstract base class for benchmark carry-on forecasters.
Attributes
freq
- A string representing the frequency of the time series.
timestamp_col
- A string representing the name of the timestamp column in the time series.
value_col
- A string representing the name of the column containing the target values in the time series.
Initializes the abstract base class with the specified parameters.
Args
value_col
- A string representing the name of the column containing the target values in the time series.
freq
- A string representing the frequency of the time series.
timestamp_col
- A string representing the name of the timestamp column in the time series.
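As a rough illustration of what the constructor and the private `_extract_date_hour` helper in the class source do, the sketch below repeats the same steps on a standalone dataframe. The function name `extract_date_hour` and the sample column names `ts` and `load` are hypothetical and exist only for this example.

```python
import pandas as pd


def extract_date_hour(df: pd.DataFrame, timestamp_col: str) -> pd.DataFrame:
    """Hypothetical standalone version of the helper: adds "date" and "hour" columns."""
    parsed = pd.to_datetime(df[timestamp_col])
    df["date"] = parsed.dt.date  # calendar date as datetime.date objects
    df["hour"] = parsed.dt.hour  # hour of day, 0 to 23
    return df  # the input frame itself is modified and returned


# Sample data; the column names are made up for this example.
frame = pd.DataFrame({"ts": ["2024-01-01 05:00", "2024-01-02 17:30"], "load": [10.0, 12.5]})
frame = extract_date_hour(frame, "ts")

# The constructor upper-cases the frequency alias before storing it.
freq = "d".upper()
```

Because the helper writes the new columns into the dataframe it receives, callers who need the original frame unchanged may want to pass a copy.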
Ancestors
- mlflow.pyfunc.model.PythonModel
Subclasses
- BenchmarkPrevWeekCarryOnForecaster
- BenchmarkPrevYearCarryOnForecaster
Methods
def fit(self, ts: pandas.core.frame.DataFrame)
Fits the forecaster to the specified time series data.
Args
ts
- A pandas dataframe containing the time series data.
Raises
NotImplementedError
- Subclass must implement fit method.
def predict(self, context, model_input: List[str], params=None)
Predicts future values of the time series.
Args
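context
- The MLflow model context passed by pyfunc; the forecasters in this module do not use it.
model_input
- A list of timestamp strings. The forecasters in this module use the first element as the start and the last element as the end of the forecast period.
params
- Optional additional parameters; the forecasters in this module do not use them.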
Returns
A pandas dataframe containing the predicted values of the time series.
Raises
NotImplementedError
- Subclass must implement predict method.
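The concrete forecasters in this module forward only the first and last entries of `model_input` as the start and end of the forecast period. A minimal sketch of how that maps to a forecast index, assuming a hypothetical helper named `forecast_index`:

```python
from typing import List

import pandas as pd


def forecast_index(model_input: List[str], freq: str) -> pd.DatetimeIndex:
    """Hypothetical helper: builds the timestamps a forecast would cover."""
    # Only the first and last entries are used; anything in between is ignored.
    start, end = model_input[0], model_input[-1]
    return pd.date_range(start=start, end=end, freq=freq)


# Two boundary dates at daily frequency give one timestamp per day, both ends included.
idx = forecast_index(["2024-03-01", "2024-03-04"], freq="D")
```

Passing a single-element list makes the start and end coincide, so the index then holds at most one timestamp.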
class BenchmarkPrevWeekCarryOnForecaster (value_col: str, freq: str, timestamp_col: str)
class BenchmarkPrevWeekCarryOnForecaster(BenchmarkCarryOnForecaster):
    """A benchmark carry-on forecaster that predicts each timestamp from past values
    observed on the same weekday (and the same hour, for hourly data).

    Inherits from `BenchmarkCarryOnForecaster`.

    Attributes:
        freq: A string representing the frequency of the time series.
        timestamp_col: A string representing the name of the timestamp column in the time series.
        value_col: A string representing the name of the column containing the target values in the time series.
        model: A pandas dataframe, set by `fit`, holding one aggregated value per weekday
            (and hour, for hourly data).
    """

    def __init__(self, value_col: str, freq: str, timestamp_col: str):
        """Initializes the forecaster with the specified parameters.

        Args:
            value_col: A string representing the name of the column containing the target values in the time series.
            freq: A string representing the frequency of the time series.
            timestamp_col: A string representing the name of the timestamp column in the time series.
        """
        super().__init__(value_col=value_col, freq=freq, timestamp_col=timestamp_col)

    def fit(self, ts: pd.DataFrame, strategy: str, window_len: int = None) -> None:
        """Fits the forecaster to the specified time series data.

        Args:
            ts: A pandas dataframe containing the time series data.
            strategy: The aggregation passed to pandas `agg` for each group, for example "last" or "mean".
            window_len: If given, only the last `window_len` rows of each group are aggregated.

        Raises:
            ValueError: If the specified frequency is not supported.
        """
        ts = self._extract_date_hour(ts)
        ts = ts.sort_values(["date", "hour"])
        ts["weekday"] = pd.to_datetime(ts["date"]).dt.day_name()

        if self.freq == "D":
            group_cols = ["weekday"]
        elif self.freq == "H":
            group_cols = ["weekday", "hour"]
        else:
            raise ValueError(f"Supported freq values are 'D' and 'H', not {self.freq}")

        ts = ts.groupby(group_cols)
        if window_len is not None:
            # Keep the most recent rows of each group, then regroup for aggregation.
            ts = ts.tail(window_len).groupby(group_cols)
        ts = ts.agg({self.value_col: strategy})
        self.model = ts.reset_index()[[*group_cols, self.value_col]]

    def predict(self, context, model_input: List[str], params=None):
        # Forecast from the first to the last timestamp in model_input.
        return self._predict(start=model_input[0], end=model_input[-1])

    def _predict(self, start: str, end: str) -> pd.DataFrame:
        """Predicts future values of the time series.

        Args:
            start: A string representing the start date of the forecast period.
            end: A string representing the end date of the forecast period.

        Returns:
            A pandas dataframe containing the predicted values of the time series.

        Raises:
            ValueError: If the specified frequency is not supported.
        """
        future_idx = pd.date_range(start=start, end=end, freq=self.freq)
        future = pd.DataFrame({"idx": future_idx})
        future = self._extract_date_hour(future, "idx")
        future["weekday"] = pd.to_datetime(future["date"]).dt.day_name()

        if self.freq == "D":
            join_key = ["weekday"]
        elif self.freq == "H":
            join_key = ["weekday", "hour"]
        else:
            raise ValueError(f"Supported freq values are 'D' and 'H', not {self.freq}")

        # Left join: forecast timestamps without a fitted value get NaN.
        future = future.merge(self.model, how="left", on=join_key)
        return future[["idx", self.value_col]].rename(columns={"idx": self.timestamp_col})
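A benchmark carry-on forecaster that predicts each timestamp from past values observed on the same weekday (and the same hour, for hourly data). The values are combined with the aggregation chosen in fit; with "last", the forecast carries forward the most recent observation, typically the previous week's.
Inherits from BenchmarkCarryOnForecaster.
Attributes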
freq
- A string representing the frequency of the time series.
timestamp_col
- A string representing the name of the timestamp column in the time series.
value_col
- A string representing the name of the column containing the target values in the time series.
model
- A pandas dataframe, set by fit, holding one aggregated value per weekday (and hour, for hourly data).
Initializes the forecaster with the specified parameters.
Args
value_col
- A string representing the name of the column containing the target values in the time series.
freq
- A string representing the frequency of the time series.
timestamp_col
- A string representing the name of the timestamp column in the time series.
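The prediction step in the class source amounts to a lookup by weekday: the forecast timestamps are left-joined against the fitted table. The sketch below shows that join for daily data; the table contents and the column names `load` and `timestamp` are made up for illustration.

```python
import pandas as pd

# Hypothetical fitted lookup table: one value per weekday (daily frequency).
model = pd.DataFrame({"weekday": ["Monday", "Tuesday"], "load": [100.0, 110.0]})

# Forecast window: 2024-01-01 is a Monday, 2024-01-03 a Wednesday.
future = pd.DataFrame({"idx": pd.date_range(start="2024-01-01", end="2024-01-03", freq="D")})
future["weekday"] = future["idx"].dt.day_name()

# The left join keeps every forecast timestamp; weekdays absent from the table get NaN.
forecast = future.merge(model, how="left", on="weekday")[["idx", "load"]]
forecast = forecast.rename(columns={"idx": "timestamp"})
```

Here Wednesday has no fitted value, so its forecast is NaN. In practice this can happen when the training data does not cover every weekday (or every weekday and hour combination).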
Ancestors
- BenchmarkCarryOnForecaster
- mlflow.pyfunc.model.PythonModel
Methods
def fit(self, ts: pandas.core.frame.DataFrame, strategy: str, window_len: int = None) ‑> None
Fits the forecaster to the specified time series data.
Args
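ts
- A pandas dataframe containing the time series data.
strategy
- The aggregation passed to pandas agg for each weekday (or weekday and hour) group, for example "last" or "mean".
window_len
- Optional. If given, only the last window_len rows of each group are aggregated.
Raises
ValueError
- If freq is neither 'D' nor 'H'.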
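To make the roles of `strategy` and `window_len` concrete, here is a minimal sketch of the fitting step for daily data. The function `fit_weekday_profile` and the sample columns are hypothetical; unlike the class method, the sketch works on a copy of its input.

```python
from typing import Optional

import pandas as pd


def fit_weekday_profile(ts: pd.DataFrame, timestamp_col: str, value_col: str,
                        strategy: str, window_len: Optional[int] = None) -> pd.DataFrame:
    """Hypothetical daily-frequency version of the fitting step."""
    ts = ts.copy()  # avoid mutating the caller's frame
    ts[timestamp_col] = pd.to_datetime(ts[timestamp_col])
    ts = ts.sort_values(timestamp_col)
    ts["weekday"] = ts[timestamp_col].dt.day_name()
    if window_len is not None:
        # Keep only the most recent `window_len` rows of each weekday group.
        ts = ts.groupby("weekday").tail(window_len)
    return ts.groupby("weekday").agg({value_col: strategy}).reset_index()


# Three weeks of made-up daily data: values 0.0 to 20.0, starting on Monday 2024-01-01.
data = pd.DataFrame({"ts": pd.date_range("2024-01-01", periods=21, freq="D"),
                     "load": [float(i) for i in range(21)]})

last_week = fit_weekday_profile(data, "ts", "load", strategy="last")
two_week_mean = fit_weekday_profile(data, "ts", "load", strategy="mean", window_len=2)
```

With `strategy="last"` the table holds the most recent value for each weekday (14.0 for Monday here), which is the plain previous-week carry-on. With `strategy="mean"` and `window_len=2` it holds the average of the last two Mondays (10.5).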
Inherited members
- BenchmarkCarryOnForecaster: predict
class BenchmarkPrevYearCarryOnForecaster (value_col: str, timestamp_col: str)
class BenchmarkPrevYearCarryOnForecaster(BenchmarkCarryOnForecaster):
    """A benchmark carry-on forecaster that predicts, for each month start in the forecast
    period, the most recent value observed in the same calendar month.

    Inherits from `BenchmarkCarryOnForecaster`.

    Attributes:
        freq: A string representing the frequency of the time series; fixed to "MS" (month start).
        timestamp_col: A string representing the name of the timestamp column in the time series.
        value_col: A string representing the name of the column containing the target values in the time series.
        model: A pandas dataframe, set by `fit`, holding one value per calendar month.
    """

    def __init__(self, value_col: str, timestamp_col: str):
        """Initializes the forecaster with the specified parameters.

        The frequency is fixed to "MS" (month start) and is not configurable.

        Args:
            value_col: A string representing the name of the column containing the target values in the time series.
            timestamp_col: A string representing the name of the timestamp column in the time series.
        """
        super().__init__(value_col=value_col, freq="MS", timestamp_col=timestamp_col)

    def fit(self, ts: pd.DataFrame):
        """Fits the forecaster to the specified time series data.

        Args:
            ts: A pandas dataframe containing the time series data.
        """
        ts = self._extract_date_hour(ts)
        ts = ts.sort_values(["date", "hour"])
        ts["month"] = pd.to_datetime(ts["date"]).dt.month
        # After sorting, the last row of each group is the most recent one for that month.
        ts = ts.groupby("month").last()
        self.model = ts.reset_index()[["month", self.value_col]]

    def predict(self, context, model_input: List[str], params=None):
        # Forecast from the first to the last timestamp in model_input.
        return self._predict(start=model_input[0], end=model_input[-1])

    def _predict(self, start: str, end: str) -> pd.DataFrame:
        """Predicts future values of the time series.

        Args:
            start: A string representing the start date of the forecast period.
            end: A string representing the end date of the forecast period.

        Returns:
            A pandas dataframe containing the predicted values of the time series.
        """
        future_idx = pd.date_range(start=start, end=end, freq=self.freq)
        future = pd.DataFrame({"idx": future_idx})
        future = self._extract_date_hour(future, "idx")
        future["month"] = pd.to_datetime(future["date"]).dt.month
        # Left join: months without a fitted value get NaN.
        future = future.merge(self.model, how="left", on="month")
        return future[["idx", self.value_col]].rename(columns={"idx": self.timestamp_col})
A benchmark carry-on forecaster that predicts, for each month start in the forecast period, the most recent value observed in the same calendar month, typically the previous year's value.
Inherits from BenchmarkCarryOnForecaster.
Attributes
freq
- A string representing the frequency of the time series.
timestamp_col
- A string representing the name of the timestamp column in the time series.
value_col
- A string representing the name of the column containing the target values in the time series.
model
- A pandas dataframe, set by fit, holding one value per calendar month.
Initializes the forecaster with the specified parameters. The frequency is fixed to "MS" (month start) and is not configurable.
Args
value_col
- A string representing the name of the column containing the target values in the time series.
timestamp_col
- A string representing the name of the timestamp column in the time series.
Ancestors
- BenchmarkCarryOnForecaster
- mlflow.pyfunc.model.PythonModel
Methods
def fit(self, ts: pandas.core.frame.DataFrame)
Fits the forecaster to the specified time series data, keeping the most recent observed value for each calendar month.
Args
ts
- A pandas dataframe containing the time series data.
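The fit and predict steps of this forecaster can be sketched end to end on a small monthly series. This is an illustrative reconstruction from the class source, not the packaged class itself; the sample data and the column names `ts` and `load` are made up.

```python
import pandas as pd

# Made-up monthly history: 14 month-start observations from 2023-01 to 2024-02.
history = pd.DataFrame({"ts": pd.date_range("2023-01-01", periods=14, freq="MS"),
                        "load": [float(i) for i in range(14)]})

# Fit: keep the most recent observation for each calendar month.
history = history.sort_values("ts")
history["month"] = history["ts"].dt.month
model = history.groupby("month").last().reset_index()[["month", "load"]]

# Predict: month-start timestamps in the window, looked up by calendar month.
future = pd.DataFrame({"idx": pd.date_range(start="2024-03-01", end="2024-05-01", freq="MS")})
future["month"] = future["idx"].dt.month
forecast = future.merge(model, how="left", on="month")[["idx", "load"]]
```

March to May 2024 receive the values observed in March to May 2023. January appears twice in the history, so the lookup table keeps the January 2024 value rather than the 2023 one.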
Inherited members
- BenchmarkCarryOnForecaster: predict