
Pushback to the Future: Predict Pushback Time at US Airports - Benchmark


by Robert Gibboni


Coordinating our nation’s airways is the role of the National Airspace System (NAS). The NAS is one of the most complex transportation systems in the world. Operational changes can save or cost airlines, taxpayers, consumers, and the economy at large thousands to millions of dollars on a regular basis. It is critical that decisions to change procedures are made with as much lead time and certainty as possible. One significant source of uncertainty comes right at the beginning of a flight: the pushback time. A more accurate pushback time can lead to better predictability of takeoff time from the runway.

For this competition, your task is to train a machine learning model to automatically predict pushback time from public air traffic and weather data. In the Open Arena, you will work with 2 years of data to train a model and submit predictions for a validation set. In the Prescreened Arena, you will submit your trained models and inference code to run on a test set of held-out data.

In this post, we'll give a quick tour of the features and labels and demonstrate a simple benchmark. That should give you an idea of how to start creating your own solutions and submitting to the Open Arena and Prescreened Arena.

The competition includes data for 10 airports spread throughout the continental US. Here's a map showing their locations.

Location of airports

In [1]:
from datetime import timedelta
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

airports = [
    "KATL",
    "KCLT",
    "KDEN",
    "KDFW",
    "KJFK",
    "KMEM",
    "KMIA",
    "KORD",
    "KPHX",
    "KSEA",
]
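As an aside, you can reproduce a rough version of the map above with matplotlib alone. The (latitude, longitude) pairs below are approximate values we've filled in for illustration; they are not part of the competition data.

import matplotlib.pyplot as plt

# approximate coordinates for each airport (illustrative, not from the data)
airport_coords = {
    "KATL": (33.6, -84.4), "KCLT": (35.2, -80.9), "KDEN": (39.9, -104.7),
    "KDFW": (32.9, -97.0), "KJFK": (40.6, -73.8), "KMEM": (35.0, -90.0),
    "KMIA": (25.8, -80.3), "KORD": (42.0, -87.9), "KPHX": (33.4, -112.0),
    "KSEA": (47.4, -122.3),
}

fig, ax = plt.subplots(figsize=(8, 5), dpi=150)
for code, (lat, lon) in airport_coords.items():
    ax.scatter(lon, lat, color="tab:blue")
    ax.annotate(code, (lon, lat), textcoords="offset points", xytext=(4, 4))
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("Location of airports")
plt.show()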

Let's start by looking at a single airport: KATL in Atlanta, GA.

In [2]:
airport = "KATL"

Get the data

To run this notebook, you'll first need to download the following files from the Data download page (available after you've joined the competition):

  • Airport data (<airport>.tar): A tar archive containing labels and air traffic and weather features for a single airport from 2020 - 2022. Each feature is saved as a bzip2-compressed CSV. There is one file per airport.

  • Submission format (submission_format.csv): The simplest valid submission to the Open Arena that predicts 0 minutes to pushback for all flights. Use this as an example of a properly formatted submission.

Note that in the Prescreened Arena, you may use the Open Arena's validation set as training data. Those training labels are available as the "Training labels" file via the Prescreened Arena's Data download page.

Once you have downloaded the data files, you can unpack the tar archives. For example, unpack KATL with:

tar -xvf KATL.tar

# or to extract all of them at once
find . -name 'K*.tar' -exec tar xvf {} \;

which should result in the following files:

├── KATL
│   ├── KATL_config.csv.bz2
│   ├── KATL_etd.csv.bz2
│   ├── KATL_first_position.csv.bz2
│   ├── KATL_lamp.csv.bz2
│   ├── KATL_mfs.csv.bz2
│   ├── KATL_runways.csv.bz2
│   ├── KATL_standtimes.csv.bz2
│   └── KATL_tbfm.csv.bz2
└── train_labels_KATL.csv.bz2

Then you can delete the tar files. The rest of this notebook assumes that the data are extracted to a directory named data in the same directory as this notebook.
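If you'd rather extract the archives from Python (for example, on a system without the tar command), here is a minimal sketch using only the standard library's tarfile module; it assumes the tar files sit next to this notebook.

import tarfile
from pathlib import Path

# extract every downloaded airport archive into ./data
for tar_path in Path(".").glob("K*.tar"):
    with tarfile.open(tar_path) as tar:
        tar.extractall(path="data")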

In [3]:
DATA_DIRECTORY = Path("./data")

We can start by looking at the prediction target: pushback time.

Let's look at the first few labels for KATL:

In [4]:
airport = "KATL"
pushback = pd.read_csv(DATA_DIRECTORY / f"train_labels_{airport}.csv.bz2")
pushback
Out[4]:
gufi timestamp airport minutes_until_pushback
0 AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM 2021-04-03 19:30:00 KATL 114
1 AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM 2021-04-03 19:45:00 KATL 99
2 AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM 2021-04-03 20:00:00 KATL 84
3 AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM 2021-04-03 20:15:00 KATL 69
4 AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM 2021-04-03 20:30:00 KATL 54
... ... ... ... ...
3194027 XSR729.ATL.TLH.210426.2354.0037.TFM 2021-04-27 17:45:00 KATL 75
3194028 XSR729.ATL.TLH.210426.2354.0037.TFM 2021-04-27 18:00:00 KATL 60
3194029 XSR729.ATL.TLH.210426.2354.0037.TFM 2021-04-27 18:15:00 KATL 45
3194030 XSR729.ATL.TLH.210426.2354.0037.TFM 2021-04-27 18:30:00 KATL 30
3194031 XSR729.ATL.TLH.210426.2354.0037.TFM 2021-04-27 18:45:00 KATL 15

3194032 rows × 4 columns

Check out the Problem Description for an explanation of the labels and features.

The "Fuser ETD minus 15 minutes benchmark"

For this benchmark, we'll use the existing Fuser estimated time of departure (ETD) as the basis for our solution. Fuser is a data processing platform designed by NASA as part of the ATD-2 project that processes the FAA's raw data stream and distributes cleaned, real-time data on the status of individual flights nationwide.

Let's take a look at Fuser ETD data. This table tracks the estimated departure time for flights departing from an airport. It typically contains many estimates for each flight:

  • gufi: GUFI (Global Unique Flight Identifier)
  • timestamp: The time that the prediction was generated
  • departure_runway_estimated_time: Estimated time that the flight will depart from the runway

Note that the ETD in this table refers to the time the flight will depart from the runway, whereas the prediction target we're after is when the flight pushes back from the gate. To account for that, we'll subtract 15 minutes from the departure_runway_estimated_time as an estimate of pushback time. This is just a rough estimate; a slightly less simple model could learn the proper adjustment from the data!
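As a taste of what that could look like, here is a rough sketch that estimates the typical offset empirically. It assumes the pushback labels from above and the etd table loaded in the next cell, and it uses each flight's final ETD estimate, which peeks at information from after the prediction time, so treat it as a way to gauge the offset rather than as a valid feature.

# latest (final) ETD estimate per flight
latest_etd = (
    etd.sort_values("timestamp").groupby("gufi").departure_runway_estimated_time.last()
)

# reconstruct the actual pushback time implied by each label row
merged = pushback.merge(latest_etd, how="inner", on="gufi")
actual_pushback_time = pd.to_datetime(merged.timestamp) + pd.to_timedelta(
    merged.minutes_until_pushback, unit="m"
)

# minutes between the final ETD and the actual pushback; the median would be a
# data-driven replacement for the fixed 15 minutes
offset_minutes = (
    merged.departure_runway_estimated_time - actual_pushback_time
).dt.total_seconds() / 60
print(f"Median ETD-to-pushback offset: {offset_minutes.median():.1f} minutes")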

In [6]:
etd = pd.read_csv(
    DATA_DIRECTORY / airport / f"{airport}_etd.csv.bz2",
    parse_dates=["departure_runway_estimated_time", "timestamp"],
)
etd
Out[6]:
gufi timestamp departure_runway_estimated_time
0 FFT17.ATL.MBJ.211031.1050.0029.TFM 2021-11-01 07:00:13 2021-11-01 11:06:00
1 AAR2513.ATL.ICN.211101.0200.0185.TMA 2021-11-01 07:00:23 2021-11-01 05:01:00
2 FFT100.ATL.SJU.211031.1625.0067.TFM 2021-11-01 07:00:29 2021-11-01 16:41:00
3 FFT419.ATL.DEN.211031.1625.0073.TFM 2021-11-01 07:00:45 2021-11-01 16:39:00
4 FFT421.ATL.DEN.211101.0140.0090.TFM 2021-11-01 07:00:49 2021-11-02 01:52:00
... ... ... ...
13327016 FFT1516.ATL.MIA.211030.2150.0049.TFM 2021-10-31 22:59:06 2021-10-31 22:11:00
13327017 SWA3427.ATL.MIA.211031.0150.0072.TFM 2021-10-31 22:59:45 2021-11-01 03:08:00
13327018 RPA4778.ATL.ORD.211030.2055.0058.TFM 2021-10-31 22:59:52 2021-10-31 22:39:00
13327019 RPA4778.ATL.ORD.211030.2055.0058.TFM 2021-10-31 22:59:56 2021-10-31 22:39:00
13327020 DAL663.ATL.SNA.211031.0015.0043.TFM 2021-10-31 22:59:59 2021-11-01 00:27:00

13327021 rows × 3 columns

Submission format

The submission format gives us a list of flights and times for which we'll need to make predictions.

In [5]:
submission_format = pd.read_csv(
    DATA_DIRECTORY / "submission_format.csv", parse_dates=["timestamp"]
)
submission_format
Out[5]:
gufi timestamp airport minutes_until_pushback
0 AAL1008.ATL.DFW.210607.2033.0110.TFM 2021-06-08 19:15:00 KATL 0
1 AAL1008.ATL.DFW.210607.2033.0110.TFM 2021-06-08 19:30:00 KATL 0
2 AAL1008.ATL.DFW.210607.2033.0110.TFM 2021-06-08 19:45:00 KATL 0
3 AAL1008.ATL.DFW.210607.2033.0110.TFM 2021-06-08 20:00:00 KATL 0
4 AAL1008.ATL.DFW.210607.2033.0110.TFM 2021-06-08 20:15:00 KATL 0
... ... ... ... ...
2042718 XOJ760.SEA.SJC.210606.0435.0007.TFM 2021-06-06 20:45:00 KSEA 0
2042719 XOJ760.SEA.SJC.210606.0435.0007.TFM 2021-06-06 21:00:00 KSEA 0
2042720 XOJ760.SEA.SJC.210606.0435.0007.TFM 2021-06-06 21:15:00 KSEA 0
2042721 XOJ760.SEA.SJC.210606.0435.0007.TFM 2021-06-06 21:30:00 KSEA 0
2042722 XOJ760.SEA.SJC.210606.0435.0007.TFM 2021-06-06 21:45:00 KSEA 0

2042723 rows × 4 columns

We can begin to prototype our solution using a single flight and prediction time. Let's use the row at index 200 of the submission format.

In [7]:
row = submission_format.iloc[200]
row
Out[7]:
gufi                      AAL1008.ATL.DFW.211204.2135.0163.TFM
timestamp                                  2021-12-05 21:00:00
airport                                                   KATL
minutes_until_pushback                                       0
Name: 200, dtype: object

This row assumes it is 2021-12-05 21:00:00 and we're looking for a minutes-until-pushback prediction for the flight with GUFI AAL1008.ATL.DFW.211204.2135.0163.TFM.

Now let's look at the ETD dataframe for entries for that flight:

In [8]:
etd.loc[etd.gufi == row.gufi]
Out[8]:
gufi timestamp departure_runway_estimated_time
596631 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-04 21:35:50 2021-12-05 21:39:00
596946 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-04 21:40:45 2021-12-05 21:56:00
604259 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 21:02:25 2021-12-05 21:56:00
606150 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 21:39:33 2021-12-05 21:56:00
606951 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 21:51:38 2021-12-05 21:51:00
607097 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 21:52:58 2021-12-05 21:51:00
609143 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 22:26:46 2021-12-05 21:51:00
609490 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 22:32:10 2021-12-05 21:51:00
609707 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 22:37:10 2021-12-05 21:51:00
610873 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 22:51:22 2021-12-05 21:51:00
611009 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 22:53:03 2021-12-05 21:51:00
611365 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 22:58:53 2021-12-05 21:51:00
617975 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 23:32:27 2021-12-05 21:51:00
619770 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 23:55:36 2021-12-05 21:51:00

Notice how the departure time was estimated many times during the lifespan of this flight.

How we handle time is a critical aspect of this competition: we must only use features from 30 hours before the prediction time up until the prediction time itself. The following cell will filter the ETD features to just that time period (as well as just those entries that relate to the flight we are predicting).

In [9]:
now_etd = etd.loc[
    (etd.timestamp > row.timestamp - timedelta(hours=30))
    & (etd.timestamp <= row.timestamp)
    & (etd.gufi == row.gufi)
]
now_etd
Out[9]:
gufi timestamp departure_runway_estimated_time
596631 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-04 21:35:50 2021-12-05 21:39:00
596946 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-04 21:40:45 2021-12-05 21:56:00

It looks like there are two ETD estimates in that time range. We will use the most recent, since we might expect it to be the most accurate.

Finally, we subtract 15 minutes, since pushback is typically about 15 minutes before the ETD.

In [10]:
flight_pushback = now_etd.iloc[-1].departure_runway_estimated_time - timedelta(
    minutes=15
)
flight_pushback
Out[10]:
Timestamp('2021-12-05 21:41:00')

The submission format expects a prediction of the number of minutes (as an integer) from the current prediction time to the estimated pushback time.

In [11]:
flight_minutes_to_pushback = np.round(
    (flight_pushback - row.timestamp).total_seconds() / 60
).astype(int)
flight_minutes_to_pushback
Out[11]:
41

Simple, right? Now we can just repeat this process for each of the rows of the submission format and each airport in our dataset.

But first, we'll want to consider some speed optimizations for our solution. Your submission to the Prescreened Arena will need to process 1,800 time points and run in under 10 hours. It would take a very long time to run this ETD lookup for each of the >300k rows for this airport. We can speed things up by taking advantage of parallelization.
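(As an aside, another way to sidestep a per-row loop entirely is pandas.merge_asof, which matches each submission row with the most recent ETD estimate at or before its timestamp in a single vectorized pass. Here is a hedged sketch; the 30-hour tolerance approximates the feature window used throughout this notebook, and both frames must be sorted by the "on" key.)

# vectorized as-of merge: latest ETD at or before each prediction time
etd_sorted = etd.sort_values("timestamp")
sub_sorted = submission_format.loc[
    submission_format.airport == airport
].sort_values("timestamp")
matched = pd.merge_asof(
    sub_sorted,
    etd_sorted[["gufi", "timestamp", "departure_runway_estimated_time"]],
    on="timestamp",
    by="gufi",
    tolerance=pd.Timedelta(hours=30),
)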

Let's get all of the flights for one prediction time. (As you'll see later in this post, that's how the code execution environment processes your submission.) We need to make predictions for 79 flights at KATL for this prediction time.

In [12]:
now_submission_format = submission_format.loc[
    (submission_format.timestamp == row.timestamp)
    & (submission_format.airport == airport)
].reset_index(drop=True)
now_submission_format
Out[12]:
gufi timestamp airport minutes_until_pushback
0 AAL1008.ATL.DFW.211204.2135.0163.TFM 2021-12-05 21:00:00 KATL 0
1 AAL378.ATL.MIA.211204.2035.0063.TFM 2021-12-05 21:00:00 KATL 0
2 DAL1017.ATL.DEN.211204.2220.0072.TFM 2021-12-05 21:00:00 KATL 0
3 DAL1039.ATL.LAS.211204.2200.0167.TFM 2021-12-05 21:00:00 KATL 0
4 DAL1204.ATL.JAN.211204.2145.0094.TFM 2021-12-05 21:00:00 KATL 0
... ... ... ... ...
74 SWA4105.ATL.MKE.211204.2215.0031.TFM 2021-12-05 21:00:00 KATL 0
75 SWA668.ATL.DCA.211204.2050.0136.TFM 2021-12-05 21:00:00 KATL 0
76 SWA803.ATL.MIA.211204.2105.0118.TFM 2021-12-05 21:00:00 KATL 0
77 UAL225.ATL.DEN.211204.2115.0037.TFM 2021-12-05 21:00:00 KATL 0
78 UAL279.ATL.EWR.211204.2100.0196.TFM 2021-12-05 21:00:00 KATL 0

79 rows × 4 columns

A fast way to get the latest ETD for a bunch of flights is to use pandas.DataFrame.groupby to group by GUFI, then take the last estimate for each flight (making sure we've sorted ETD by timestamp).

In [13]:
etd.sort_values("timestamp", inplace=True)
now_etd = etd.loc[
    (etd.timestamp > row.timestamp - timedelta(hours=30))
    & (etd.timestamp <= row.timestamp)
]
latest_now_etd = now_etd.groupby("gufi").last().departure_runway_estimated_time

Finally, we can merge the submission format and latest ETDs on GUFI and generate our predictions for this batch.

In [14]:
departure_runway_estimated_time = now_submission_format.merge(
    latest_now_etd, how="left", on="gufi"
).departure_runway_estimated_time
departure_runway_estimated_time
Out[14]:
0    2021-12-05 21:56:00
1    2021-12-05 21:13:00
2    2021-12-05 22:30:00
3    2021-12-05 22:03:00
4    2021-12-05 21:47:00
             ...        
74   2021-12-05 22:29:00
75   2021-12-05 21:10:00
76   2021-12-05 21:28:00
77   2021-12-05 21:34:00
78   2021-12-05 21:10:00
Name: departure_runway_estimated_time, Length: 79, dtype: datetime64[ns]
In [15]:
estimated_pushback = (
    (
        departure_runway_estimated_time - now_submission_format.timestamp
    ).dt.total_seconds()
    / 60
) - 15

Since subtracting 15 minutes could put us in a situation where we are predicting a pushback time before the prediction time, we can clip our predictions to never be negative. We'll also cast the predictions to integer as required by the submission format.

In [16]:
estimated_pushback = estimated_pushback.clip(lower=0).astype(int)
estimated_pushback
Out[16]:
0     41
1      0
2     75
3     48
4     32
      ..
74    74
75     0
76    13
77    19
78     0
Length: 79, dtype: int64

Now let's wrap that all up in a function that takes a timestamp as input and outputs all of the predictions at an airport for that timestamp.

In [28]:
def estimate_pushback(now: pd.Timestamp) -> pd.DataFrame:
    # note: uses the module-level `airport_submission_format` and `etd` dataframes
    # subset submission format to the current prediction time
    now_submission_format = airport_submission_format.loc[
        airport_submission_format.timestamp == now
    ].reset_index(drop=True)

    # filter features to 30 hours before prediction time to prediction time
    now_etd = etd.loc[(etd.timestamp > now - timedelta(hours=30)) & (etd.timestamp <= now)]

    # get the latest ETD for each flight
    latest_now_etd = now_etd.groupby("gufi").last().departure_runway_estimated_time

    # merge the latest ETD with the flights we are predicting
    departure_runway_estimated_time = now_submission_format.merge(
        latest_now_etd, how="left", on="gufi"
    ).departure_runway_estimated_time

    now_prediction = now_submission_format.copy()

    now_prediction["minutes_until_pushback"] = (
        (departure_runway_estimated_time - now_submission_format.timestamp).dt.total_seconds() / 60
    ) - 15

    return now_prediction

We'll use tqdm.contrib.concurrent.process_map to parallelize the function and show a helpful progress bar.

In [18]:
from tqdm.contrib.concurrent import process_map
In [41]:
airport_submission_format = submission_format.loc[submission_format.airport == airport]
predictions = process_map(
    estimate_pushback,
    pd.to_datetime(airport_submission_format.timestamp.unique()),
    chunksize=10,
)

pd.concat(predictions, ignore_index=True)
Out[41]:
gufi timestamp airport minutes_until_pushback
0 AAL1006.SEA.DFW.210827.1300.0052.TFM_TFDM 2021-08-28 00:00:00 KSEA 74.0
1 ASA107.SEA.ANC.210827.2140.0020.TFM 2021-08-28 00:00:00 KSEA 29.0
2 ASA1092.SEA.LAX.210827.2157.0001.TFM 2021-08-28 00:00:00 KSEA 66.0
3 ASA1146.SEA.AUS.210827.2140.0044.TFM 2021-08-28 00:00:00 KSEA 57.0
4 ASA123.SEA.FAI.210827.1442.0018.TFM 2021-08-28 00:00:00 KSEA 65.0
... ... ... ... ...
157315 UAL2436.SEA.DEN.210211.1321.0008.TFM 2021-02-12 12:00:00 KSEA 83.0
157316 UAL2436.SEA.DEN.210213.1321.0001.TFM 2021-02-14 12:00:00 KSEA 83.0
157317 UAL2436.SEA.DEN.210213.1321.0001.TFM 2021-02-14 12:15:00 KSEA 68.0
157318 UAL2436.SEA.DEN.210213.1321.0001.TFM 2021-02-14 12:30:00 KSEA 53.0
157319 UAL2436.SEA.DEN.210214.1321.0010.TFM 2021-02-15 12:00:00 KSEA 83.0

157320 rows × 4 columns

That's all of the predictions for an entire airport in about 2 minutes!

Now let's iterate over all the airports and generate predictions. We'll save individual airport predictions out and concatenate them together as a final step.

In [20]:
for airport in airports:
    print(f"Processing {airport}")
    airport_predictions_path = Path(f"validation_predictions_{airport}.csv.bz2")
    if airport_predictions_path.exists():
        print(f"Predictions for {airport} already exist.")
        continue

    # subset submission format to current airport
    airport_submission_format = submission_format.loc[
        submission_format.airport == airport
    ]

    # load airport's ETD data and sort by timestamp
    etd = pd.read_csv(
        DATA_DIRECTORY / airport / f"{airport}_etd.csv.bz2",
        parse_dates=["departure_runway_estimated_time", "timestamp"],
    ).sort_values("timestamp")

    # process all prediction times in parallel
    predictions = process_map(
        estimate_pushback,
        pd.to_datetime(airport_submission_format.timestamp.unique()),
        chunksize=20,
    )

    # concatenate individual prediction times to a single dataframe
    predictions = pd.concat(predictions, ignore_index=True)
    predictions["minutes_until_pushback"] = predictions.minutes_until_pushback.clip(
        lower=0
    ).astype(int)

    # reindex the predictions to match the expected ordering in the submission format
    predictions = (
        predictions.set_index(["gufi", "timestamp", "airport"])
        .loc[
            airport_submission_format.set_index(["gufi", "timestamp", "airport"]).index
        ]
        .reset_index()
    )

    # save the predictions for the current airport
    predictions.to_csv(airport_predictions_path, index=False)
Processing KATL
Processing KCLT
Processing KDEN
Processing KDFW
Processing KJFK
Processing KMEM
Processing KMIA
Processing KORD
Processing KPHX
Processing KSEA
In [59]:
predictions = []

for airport in airports:
    airport_predictions_path = Path(f"validation_predictions_{airport}.csv.bz2")
    predictions.append(pd.read_csv(airport_predictions_path, parse_dates=["timestamp"]))

predictions = pd.concat(predictions, ignore_index=True)
predictions["minutes_until_pushback"] = predictions.minutes_until_pushback.astype(int)
In [60]:
with pd.option_context("float_format", "{:.2f}".format):
    display(predictions.minutes_until_pushback.describe())
count   2042723.00
mean         47.64
std          31.54
min           0.00
25%          23.00
50%          45.00
75%          69.00
max        1469.00
Name: minutes_until_pushback, dtype: float64

Most of the predictions fall between 0 and 70 minutes. The peak at 0 is due to a number of negative predictions that we set to 0.
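A quick hedged check of that claim:

# fraction of predictions that ended up exactly at zero after clipping
(predictions.minutes_until_pushback == 0).mean()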

In [67]:
fig, ax = plt.subplots(figsize=(6, 4), dpi=150)
predictions.minutes_until_pushback.clip(lower=0, upper=200).hist(bins=np.arange(0, 200), ax=ax)
ax.set_title("Distribution of predicted minutes to pushback")
ax.set_ylabel("Number of predictions")
ax.set_xlabel("Minutes to pushback")
_ = plt.show()

It is a good idea to do a few final checks to make sure our prediction conforms to the proper submission format. The platform will reject submissions that do not match the provided submission format, but it's nice to check locally just to be sure.

In [22]:
assert (predictions.columns == submission_format.columns).all()
assert len(predictions) == len(submission_format)
assert predictions[["gufi", "timestamp", "airport"]].equals(
    submission_format[["gufi", "timestamp", "airport"]]
)

Submit predictions to the Open Arena

Finally, we can save our combined predictions and upload them to the Open Arena for scoring!

We highly recommend saving your submission as a zipped CSV to drastically reduce the file size and upload time. pandas.DataFrame.to_csv can do this automatically if you give it a file path with the .zip extension.

In [23]:
predictions.to_csv("validation_predictions.zip", index=False)

Code submissions in the Prescreened Arena

In the Prescreened Arena, rather than submit predictions themselves, you'll submit your trained model and inference code, and we will compute predictions in our code execution environment.

The runtime repository has a ton of information including code examples you can use when constructing your own code submission, instructions for how to test your submission locally, and much more. For now, we'll simply show how to turn the Fuser ETD baseline solution into a valid code submission.

A code submission must include solution.py that implements two functions:

  • load_model that returns any model assets that are needed for prediction. This solution does not have any model assets, so the function can just return None.
  • predict that takes the features as inputs and outputs predictions for a set of flights at a single prediction time.

The code execution environment does a lot of the work for you: it loads the features and subsets them to the valid time ranges for each prediction time. All you need to do is provide a function predict that takes as input:

  • A set of feature dataframes: These are already filtered to only the valid time range from 30 hours before the prediction time up until the prediction time. You can use these features without restriction to generate your prediction.
  • A "partial submission format": A subset of rows from the full submission format corresponding to all of the flights at a single prediction time for one airport.
  • Model assets: Whatever your load_model function returns, most likely your trained model.

and outputs predictions for all the flights in the partial submission format.

Now we'll show how to turn the Fuser ETD benchmark into a valid code submission.

In [24]:
"""Solution for the NASA Pushback to the Future competition."""
from pathlib import Path
from typing import Any

from loguru import logger
import pandas as pd


def load_model(solution_directory: Path) -> Any:
    """Load any model assets from disk."""
    return


def predict(
    config: pd.DataFrame,
    etd: pd.DataFrame,
    first_position: pd.DataFrame,
    lamp: pd.DataFrame,
    mfs: pd.DataFrame,
    runways: pd.DataFrame,
    standtimes: pd.DataFrame,
    tbfm: pd.DataFrame,
    tfm: pd.DataFrame,
    airport: str,
    prediction_time: pd.Timestamp,
    partial_submission_format: pd.DataFrame,
    model: Any,
    solution_directory: Path,
) -> pd.DataFrame:
    """Make predictions for the a set of flights at a single airport and prediction time."""
    logger.info("Computing prediction based on Fuser ETD")

    latest_etd = etd.sort_values("timestamp").groupby("gufi").last().departure_runway_estimated_time
    departure_runway_estimated_time = partial_submission_format.merge(
        latest_etd, how="left", on="gufi"
    ).departure_runway_estimated_time

    prediction = partial_submission_format.copy()
    prediction["minutes_until_pushback"] = (
        (departure_runway_estimated_time - partial_submission_format.timestamp).dt.total_seconds()
        / 60
    ) - 15

    prediction["minutes_until_pushback"] = prediction.minutes_until_pushback.clip(lower=0).fillna(
        30
    )

    return prediction
In [25]:
!zip solution.zip solution.py
updating: solution.py (deflated 60%)

Now we have a submission that we can upload to the Prescreened Arena (provided that you've already been prescreened)!

Since it can take several hours to run the submission on the full test data, it's a good idea to first submit a "smoke test" version, which runs on a few hours of data from the training set. Smoke tests are only for helping quickly debug your submission, and scores on the smoke test are not counted in the leaderboard.

Once the smoke test has completed successfully, go ahead and submit your solution for evaluation on the full test set.

Even this simple solution takes about 5 hours to run, so you should definitely consider ways to optimize how your solution processes features and performs inference.
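One small change worth trying: on a timestamp-sorted frame, drop_duplicates(keep="last") produces the same latest-ETD-per-flight result as groupby("gufi").last() for this column, often faster. A minimal sketch, assuming the same etd dataframe as above:

# keep only the final row per GUFI on a timestamp-sorted frame
latest_etd = (
    etd.sort_values("timestamp")
    .drop_duplicates(subset="gufi", keep="last")
    .set_index("gufi")
    .departure_runway_estimated_time
)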

That concludes our benchmark! Head over to the competition home page to get started building your own solution. We're looking forward to seeing what you come up with!