Module panama.ml.benchmark.forecasting

Classes

class BenchmarkCarryOnForecaster (value_col: str, freq: str, timestamp_col: str)

An abstract base class for benchmark carry-on forecasters.

Attributes

freq
A string representing the frequency of the time series.
timestamp_col
A string representing the name of the timestamp column in the time series.
value_col
A string representing the name of the column containing the target values in the time series.

Initializes the abstract base class with the specified parameters.

Args

value_col
A string representing the name of the column containing the target values in the time series.
freq
A string representing the frequency of the time series.
timestamp_col
A string representing the name of the timestamp column in the time series.
class BenchmarkCarryOnForecaster(PythonModel):
    """An abstract base class for benchmark carry-on forecasters.

    Attributes:
        freq: A string representing the frequency of the time series.
        timestamp_col: A string representing the name of the timestamp column in the time series.
        value_col: A string representing the name of the column containing the target values in the time series.
    """

    def __init__(self, value_col: str, freq: str, timestamp_col: str):
        """Initializes the abstract base class with the specified parameters.

        Args:
            value_col: A string representing the name of the column containing the target values in the time series.
            freq: A string representing the frequency of the time series.
            timestamp_col: A string representing the name of the timestamp column in the time series.
        """
        self.freq = freq.upper()
        self.timestamp_col = timestamp_col
        self.value_col = value_col

    def _extract_date_hour(self, ts: pd.DataFrame, timestamp_col: str = None) -> pd.DataFrame:
        """Adds date and hour columns derived from a timestamp column.

        Args:
            ts: A pandas dataframe containing a timestamp column. It is modified in place.
            timestamp_col: The name of the timestamp column to read. Defaults to `self.timestamp_col`.

        Returns:
            The same dataframe with two added columns: "date" (the calendar date) and "hour" (the hour of the day).
        """
        timestamp_col = timestamp_col if timestamp_col else self.timestamp_col
        # Parse the timestamps once and derive both columns from the result.
        timestamps = pd.to_datetime(ts[timestamp_col])
        ts["date"] = timestamps.dt.date
        ts["hour"] = timestamps.dt.hour
        return ts

    @abstractmethod
    def fit(self, ts: pd.DataFrame):
        """Fits the forecaster to the specified time series data.

        Args:
            ts: A pandas dataframe containing the time series data.

        Raises:
            NotImplementedError: Subclass must implement fit method.
        """
        raise NotImplementedError("Subclass must implement fit method.")

    @abstractmethod
    def predict(self, context, model_input: List[str], params=None):
        """Predicts future values of the time series.

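        Args:
            context: The MLflow model context supplied by the pyfunc runtime.
            model_input: A list of date strings describing the forecast period. The subclasses in this module use the first and last elements as the start and end dates.
            params: Optional additional inference parameters.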

        Returns:
            A pandas dataframe containing the predicted values of the time series.

        Raises:
            NotImplementedError: Subclass must implement predict method.
        """
        raise NotImplementedError("Subclass must implement predict method.")

Ancestors

  • mlflow.pyfunc.model.PythonModel

Subclasses

  • BenchmarkPrevWeekCarryOnForecaster
  • BenchmarkPrevYearCarryOnForecaster

Methods

def fit(self, ts: pandas.core.frame.DataFrame)

Fits the forecaster to the specified time series data.

Args

ts
A pandas dataframe containing the time series data.

Raises

NotImplementedError
Subclass must implement fit method.
def predict(self, context, model_input: List[str], params=None)

Predicts future values of the time series.

Args

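context
The MLflow model context supplied by the pyfunc runtime.
model_input
A list of date strings describing the forecast period. The subclasses in this module use the first and last elements as the start and end dates.
params
Optional additional inference parameters.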

Returns

A pandas dataframe containing the predicted values of the time series.

Raises

NotImplementedError
Subclass must implement predict method.
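
Both subclasses rely on the private helper `_extract_date_hour` from the source listing above before they group or join on calendar fields. The following is a minimal standalone sketch of that step, not the library's own code: the function name `extract_date_hour` and the column name `ts` are hypothetical, and unlike the helper it works on a copy rather than modifying its input.

```python
import pandas as pd


def extract_date_hour(ts: pd.DataFrame, timestamp_col: str) -> pd.DataFrame:
    """Return a copy of ts with 'date' and 'hour' columns added."""
    out = ts.copy()  # assumption: copy instead of mutating the caller's frame
    timestamps = pd.to_datetime(out[timestamp_col])
    out["date"] = timestamps.dt.date  # calendar date of each timestamp
    out["hour"] = timestamps.dt.hour  # hour of day, 0 to 23
    return out


# Hypothetical input: two timestamps on the same day.
frame = pd.DataFrame({"ts": ["2024-03-01 00:00:00", "2024-03-01 13:30:00"]})
result = extract_date_hour(frame, "ts")
```

Working on a copy is a design choice of this sketch: it keeps the caller's dataframe free of the extra `date` and `hour` columns.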
class BenchmarkPrevWeekCarryOnForecaster (value_col: str, freq: str, timestamp_col: str)

A benchmark carry-on forecaster that predicts each value from past observations on the same weekday (and hour, for hourly data), for example the value from the previous week.

Inherits from BenchmarkCarryOnForecaster.

Attributes

freq
A string representing the frequency of the time series.
timestamp_col
A string representing the name of the timestamp column in the time series.
value_col
A string representing the name of the column containing the target values in the time series.
model
A pandas dataframe, set by fit, containing the aggregated value for each weekday (and hour, for hourly data).

Initializes the forecaster with the specified parameters.

Args

value_col
A string representing the name of the column containing the target values in the time series.
freq
A string representing the frequency of the time series.
timestamp_col
A string representing the name of the timestamp column in the time series.
class BenchmarkPrevWeekCarryOnForecaster(BenchmarkCarryOnForecaster):
    """A benchmark carry-on forecaster that predicts each value from past observations on the same weekday (and hour, for hourly data), for example the value from the previous week.

    Inherits from `BenchmarkCarryOnForecaster`.

    Attributes:
        freq: A string representing the frequency of the time series.
        timestamp_col: A string representing the name of the timestamp column in the time series.
        value_col: A string representing the name of the column containing the target values in the time series.
        model: A pandas dataframe, set by `fit`, containing the aggregated value for each weekday (and hour, for hourly data).
    """

    def __init__(self, value_col: str, freq: str, timestamp_col: str):
        """Initializes the forecaster with the specified parameters.

        Args:
            value_col: A string representing the name of the column containing the target values in the time series.
            freq: A string representing the frequency of the time series.
            timestamp_col: A string representing the name of the timestamp column in the time series.
        """
        super().__init__(value_col=value_col, freq=freq, timestamp_col=timestamp_col)

    def fit(self, ts: pd.DataFrame, strategy: str, window_len: int = None) -> None:
        """Fits the forecaster to the specified time series data.

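        Args:
            ts: A pandas dataframe containing the time series data.
            strategy: The name of the pandas aggregation (for example "mean" or "last") applied to the values of each weekday (and hour) group.
            window_len: The number of most recent observations per group to aggregate. If None, all observations are used.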

        Raises:
            ValueError: If the specified frequency is not supported.
        """
        ts = self._extract_date_hour(ts)
        ts = ts.sort_values(["date", "hour"])
        ts["weekday"] = pd.to_datetime(ts["date"]).dt.day_name()
        if self.freq == "D":
            group_cols = ["weekday"]
        elif self.freq == "H":
            group_cols = ["weekday", "hour"]
        else:
            raise ValueError(f"Supported freq values are 'D' and 'H', not {self.freq}")

        ts = ts.groupby(group_cols)
        if window_len is not None:
            ts = ts.tail(window_len).groupby(group_cols)
        ts = ts.agg({self.value_col: strategy})
        self.model = ts.reset_index()[[*group_cols, self.value_col]]

    def predict(self, context, model_input: List[str], params=None):
        return self._predict(start=model_input[0], end=model_input[-1])

    def _predict(self, start: str, end: str) -> pd.DataFrame:
        """Predicts future values of the time series.

        Args:
            start: A string representing the start date of the forecast period.
            end: A string representing the end date of the forecast period.

        Returns:
            A pandas dataframe containing the predicted values of the time series.

        Raises:
            ValueError: If the specified frequency is not supported.
        """
        future_idx = pd.date_range(start=start, end=end, freq=self.freq)
        future = pd.DataFrame({"idx": future_idx})
        future = self._extract_date_hour(future, "idx")
        future["weekday"] = pd.to_datetime(future["date"]).dt.day_name()
        if self.freq == "D":
            join_key = ["weekday"]
        elif self.freq == "H":
            join_key = ["weekday", "hour"]
        else:
            raise ValueError(f"Supported freq values are 'D' and 'H', not {self.freq}")
        future = future.merge(self.model, how="left", on=join_key)
        return future[["idx", self.value_col]].rename(columns={"idx": self.timestamp_col})

Ancestors

  • BenchmarkCarryOnForecaster
  • mlflow.pyfunc.model.PythonModel

Methods

def fit(self, ts: pandas.core.frame.DataFrame, strategy: str, window_len: int = None) ‑> None

Fits the forecaster to the specified time series data.

Args

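ts
A pandas dataframe containing the time series data.
strategy
The name of the pandas aggregation (for example "mean" or "last") applied to the values of each weekday (and hour) group.
window_len
The number of most recent observations per group to aggregate. If None, all observations are used.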

Raises

ValueError
If the specified frequency is not supported.
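
The fit and predict logic of this class can be sketched in plain pandas: build a per-weekday profile from the most recent observations, then left-join it onto a future date index. This is an illustrative sketch rather than the class itself. The column names `timestamp` and `load`, the sample data, and the choice of a "mean" strategy with a window of 2 are assumptions, and it covers only the daily ("D") case.

```python
import pandas as pd

# Hypothetical daily series: four full weeks, Monday 2024-01-01 to Sunday 2024-01-28.
history = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=28, freq="D"),
    "load": list(range(28)),
})
history["weekday"] = history["timestamp"].dt.day_name()

# "Fit": keep the last `window_len` observations of each weekday, then aggregate them.
window_len = 2
recent = history.sort_values("timestamp").groupby("weekday").tail(window_len)
profile = recent.groupby("weekday").agg({"load": "mean"}).reset_index()

# "Predict": build the future index and look up each weekday in the profile.
future = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-29", "2024-02-04", freq="D"),
})
future["weekday"] = future["timestamp"].dt.day_name()
forecast = future.merge(profile, how="left", on="weekday")[["timestamp", "load"]]
```

With a window of 1 and a "last" aggregation, the same steps reduce to repeating the value observed on the same weekday one week earlier.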

Inherited members

  • BenchmarkCarryOnForecaster: predict

class BenchmarkPrevYearCarryOnForecaster (value_col: str, timestamp_col: str)

A benchmark carry-on forecaster that predicts the value from the same time period in the previous year.

Inherits from BenchmarkCarryOnForecaster.

Attributes

freq
A string representing the frequency of the time series.
timestamp_col
A string representing the name of the timestamp column in the time series.
value_col
A string representing the name of the column containing the target values in the time series.
model
A pandas dataframe, set by fit, containing the last observed value for each calendar month.

Initializes the forecaster with a fixed month-start ("MS") frequency.

Args

value_col
A string representing the name of the column containing the target values in the time series.
timestamp_col
A string representing the name of the timestamp column in the time series.
class BenchmarkPrevYearCarryOnForecaster(BenchmarkCarryOnForecaster):
    """A benchmark carry-on forecaster that predicts the value from the same time period in the previous year.

    Inherits from `BenchmarkCarryOnForecaster`.

    Attributes:
        freq: A string representing the frequency of the time series.
        timestamp_col: A string representing the name of the timestamp column in the time series.
        value_col: A string representing the name of the column containing the target values in the time series.
        model: A pandas dataframe, set by `fit`, containing the last observed value for each calendar month.
    """

    def __init__(self, value_col: str, timestamp_col: str):
        """Initializes the forecaster with a fixed month-start ("MS") frequency.

        Args:
            value_col: A string representing the name of the column containing the target values in the time series.
            timestamp_col: A string representing the name of the timestamp column in the time series.
        """
        super().__init__(value_col=value_col, freq="MS", timestamp_col=timestamp_col)

    def fit(self, ts: pd.DataFrame):
        """Fits the forecaster to the specified time series data.

        Args:
            ts: A pandas dataframe containing the time series data. Only the last observed value of each calendar month is kept.
        """
        ts = self._extract_date_hour(ts)
        ts = ts.sort_values(["date", "hour"])
        ts["month"] = pd.to_datetime(ts["date"]).dt.month
        ts = ts.groupby("month").last()
        self.model = ts.reset_index()[["month", self.value_col]]

    def predict(self, context, model_input: List[str], params=None):
        return self._predict(start=model_input[0], end=model_input[-1])

    def _predict(self, start: str, end: str) -> pd.DataFrame:
        """Predicts future values of the time series.

        Args:
            start: A string representing the start date of the forecast period.
            end: A string representing the end date of the forecast period.

        Returns:
            A pandas dataframe containing the predicted values of the time series, one row per month start between start and end.
        """
        future_idx = pd.date_range(start=start, end=end, freq=self.freq)
        future = pd.DataFrame({"idx": future_idx})
        future = self._extract_date_hour(future, "idx")
        future["month"] = pd.to_datetime(future["date"]).dt.month
        future = future.merge(self.model, how="left", on="month")
        return future[["idx", self.value_col]].rename(columns={"idx": self.timestamp_col})

Ancestors

  • BenchmarkCarryOnForecaster
  • mlflow.pyfunc.model.PythonModel

Methods

def fit(self, ts: pandas.core.frame.DataFrame)

Fits the forecaster to the specified time series data.

Args

ts
A pandas dataframe containing the time series data. Only the last observed value of each calendar month is kept.
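
The monthly carry-on can be sketched the same way in plain pandas: keep the last observed value of each calendar month, then join it onto a month-start index for the forecast period. This is a sketch under stated assumptions, not the class itself. The column names `timestamp` and `sales` and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical monthly series covering January 2022 to December 2023.
history = pd.DataFrame({
    "timestamp": pd.date_range("2022-01-01", periods=24, freq="MS"),
    "sales": list(range(24)),
})
history["month"] = history["timestamp"].dt.month

# "Fit": the last observed value for each calendar month (1 to 12).
profile = (
    history.sort_values("timestamp")
    .groupby("month")["sales"]
    .last()
    .reset_index()
)

# "Predict": one row per month start, filled with that month's stored value.
future = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", "2024-03-01", freq="MS"),
})
future["month"] = future["timestamp"].dt.month
forecast = future.merge(profile, how="left", on="month")[["timestamp", "sales"]]
```

Because the lookup key is the calendar month alone, the forecast for January 2024 is the January 2023 value only when the history actually extends through 2023. With an older history, the stored value comes from whichever year last contained that month.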

Inherited members

  • BenchmarkCarryOnForecaster: predict