You haven't seen anything until you need to put a 4.2 GB gzipped CSV into a pandas dataframe, which works without any issues, I should note.
I really don't think that's a lot either. Nowadays we routinely process terabytes of data.
Yeah, it was just a simple example. Although using just pandas (without something like dask) for loading terabytes of data at once into a single dataframe may not be the best idea, even with enough memory.
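For what it's worth, plain pandas can also stream a gzipped CSV in pieces instead of loading it all at once. A rough sketch, assuming the aggregate you want can be computed chunk by chunk; the file, the `value` column, and the helper name `sum_column_in_chunks` are made up for the demo:

```python
import gzip
import os
import tempfile

import pandas as pd


def sum_column_in_chunks(path, column, chunksize=100_000):
    """Sum one column of a gzipped CSV without holding the whole file in memory."""
    total = 0.0
    rows = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for chunk in pd.read_csv(path, compression="gzip", chunksize=chunksize):
        total += chunk[column].sum()
        rows += len(chunk)
    return total, rows


# Tiny stand-in for the real multi-gigabyte file (hypothetical data).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.csv.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as f:
    f.write("id,value\n")
    for i in range(10):
        f.write(f"{i},{i * 2}\n")

total, rows = sum_column_in_chunks(demo_path, "value", chunksize=3)
print(total, rows)
```

This only helps when the work splits cleanly per chunk (sums, counts, filters). Anything that needs the whole dataset at once, like a global sort, is where something like dask starts to make sense.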
Is 600 MB a lot for pandas? Of course, CSV isn't really optimal, but I would've sworn pandas happily works with gigabytes of data.
And there are like 8 software projects dedicated to making pandas wrappers that work with large datasets, because this is somehow better than engineers and statisticians learning SQL or some kind of distributed computation strategy.
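To make the SQL point concrete, here is a minimal sketch using only the standard library: stream the CSV into SQLite in batches and let the database do the aggregation. The `sales` table, its `region` and `amount` columns, and the helper name are all hypothetical, and an in-memory database is used just to keep the demo self-contained; for data larger than RAM you would point `sqlite3.connect` at a file instead.

```python
import csv
import io
import sqlite3


def aggregate_csv_with_sql(csv_file, batch_size=1000):
    """Load CSV rows into SQLite in batches, then aggregate with GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

    batch = []
    for row in csv.DictReader(csv_file):
        batch.append((row["region"], float(row["amount"])))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO sales VALUES (?, ?)", batch)
            batch.clear()
    if batch:
        # Flush whatever is left over after the last full batch.
        conn.executemany("INSERT INTO sales VALUES (?, ?)", batch)

    result = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


# Hypothetical demo data standing in for a large file.
demo = io.StringIO("region,amount\nnorth,10\nsouth,5\nnorth,2.5\n")
totals = aggregate_csv_with_sql(demo, batch_size=2)
print(totals)
```

Only one batch of rows is ever held in Python at a time; the grouping itself happens inside the database engine.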