One surprising development in AI, at least to me, is the growing proliferation of foundation models. Text? Sure, that makes sense. There’s lots of data out there. Then images, audio, and video seemed like a natural continuation.
But now time series, tabular data, geospatial data, graphs? Foundation models built on public and synthetic data are producing workable results in all of these areas and more. I always thought the problems in these spaces were too unique for a foundation model approach, but it increasingly looks like I was wrong. It’s kind of mind-blowing that models, often trained heavily on synthetic data, can deliver what appears to be strong out-of-sample performance in such specific domains.
As an example, you can see this in models like Chronos-2 for time series forecasting (as someone who deals with time series a lot, I think this model is an outstanding achievement). It is trained on large amounts of synthetic and public sequence data and yet seems to transfer well to real-world forecasting tasks with relatively little adaptation.
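To make the "little adaptation" point concrete, here is a minimal sketch of what zero-shot forecasting with a pretrained Chronos-style model looks like. It uses the ChronosPipeline interface and the amazon/chronos-t5-small checkpoint from the open-source chronos-forecasting package; Chronos-2 ships with its own pipeline, and the exact class and checkpoint names may differ, so treat this as illustrative of the workflow rather than the exact Chronos-2 API.

```python
# Zero-shot forecasting sketch (pip install chronos-forecasting).
# Uses the original ChronosPipeline interface; Chronos-2's class and
# checkpoint names may differ.
import torch
from chronos import ChronosPipeline

# Load a pretrained checkpoint -- no task-specific training happens here.
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cpu",
    torch_dtype=torch.float32,
)

# Any 1-D history of observations works as context; no feature engineering.
history = torch.tensor(
    [112., 118., 132., 129., 121., 135., 148., 148., 136., 119.]
)

# Sample-based probabilistic forecast: shape [num_series, num_samples, horizon].
forecast = pipeline.predict(history, prediction_length=6)

# Collapse the samples into a median point forecast per future step.
point_forecast = forecast[0].quantile(0.5, dim=0)
print(point_forecast)
```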
As another example, a similar idea shows up in tabular data with models like TabPFN. Instead of training a new model for each dataset, it uses a probabilistic model pretrained over a distribution of synthetic tasks and then performs inference directly on new tables. In many cases it appears to be competitive with traditional pipelines, often with much less setup.
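In the same spirit, here is a short sketch of the TabPFN workflow using the open-source tabpfn package and its scikit-learn-style interface. The dataset here is just a stand-in, and constructor arguments have shifted between package versions, so the details are illustrative rather than definitive.

```python
# In-context tabular prediction with TabPFN (pip install tabpfn).
# The scikit-learn-style fit/predict interface is the real one, but
# constructor options vary across package versions; defaults are used here.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()   # pretrained on synthetic tasks; no per-dataset training loop
clf.fit(X_train, y_train)  # "fit" mostly stores the data for in-context inference
pred = clf.predict(X_test)

print("accuracy:", accuracy_score(y_test, pred))
```

The notable design point is that there is no hyperparameter search or gradient training step in this loop; the pretrained model does the heavy lifting at inference time, which is what makes the "much less setup" claim plausible.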
I had always assumed these problem classes were too structured and too domain specific for this to work well in practice. That assumption now feels questionable.
For a long time the working assumption was that these domains required custom pipelines, custom features, and custom training for each problem. The structure of the data and the specificity of the use cases made generalization seem unlikely.
What we are now starting to see is that broad representations can transfer further than expected. They do not solve everything out of the box, but they seem to move the starting point from zero to something already useful.
This will shift the way we work in a couple of ways. First, it potentially reduces the amount of domain specific data needed to reach acceptable performance, which could lower the barrier for teams without large proprietary datasets.
Second, it shifts where effort is best spent. Less time may go into building models from scratch and more time into framing the problem, cleaning the data, evaluating performance, and integrating models into real systems.
Well, I’m a convert. I’m betting we’re headed toward a future where most models will be built on top of foundation models rather than as entirely custom-trained solutions.
The more interesting questions now seem to be how fast this happens and which parts of the stack gradually become less important as a result.
