So this is a reddit response.
Automatic binary serialization (pickle, dill) is super convenient and a great way to make a quick proof of concept. I’ve used it more than my share of times to whack together some kind of serialization for a project I’m working on. But, just like real pickles, these solutions start to go really bad around the 2-year mark.
Dive Into Python 3 says, quite sensibly, that the only time it’s reasonable to use pickle is when “the data is only meant to be used by the same program that created it, never sent over a network, and never read by anything other than the program that created it.” The trick there, I think, is that no program is going to be “the same program that created it” 2 years later unless you’ve abandoned support for that program entirely.
- When you unpickle data, it runs that data as python code - which introduces security vulnerabilities in the same way that using
- If the Python code changes and the pickled data does not, you’re going to have a bad time. Versioning your pickles is not a great solution for this problem, as it leaves you either throwing an error on old objects (“welp, this data is ruined forever.”) or trying to write code that works with every version of the object that’s ever existed.
- Pickle itself changes. There are now 6 different versions of the pickle protocol (0 through 5), and the default protocol has changed between Python releases.
- Pickle is not a text format. The Art of Unix Programming mounts a pretty passionate defense of text-based formats.
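To make the security point concrete, here's a minimal sketch of how a pickle can make the loader call something it never asked for. `Payload` is a made-up class standing in for data from someone you don't trust, and the `eval` expression is deliberately harmless; a real attacker would pick a nastier callable.

```python
import pickle


class Payload:
    """Hypothetical stand-in for data supplied by someone you don't trust."""

    def __reduce__(self):
        # pickle records "call this callable with these arguments" and
        # replays that call when the data is loaded.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# No Payload object comes back: loads() called eval("6 * 7") instead.
loaded = pickle.loads(blob)
print(loaded)  # prints 42
```

Nothing in the `pickle.loads` call site hints that `eval` is about to run, which is why loading pickles from an untrusted source is a bad idea.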
Text-based formats are:
- Human readable.
- Easy to pass between systems.
- Easy to modify with quick scripts.
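As a rough illustration of those three points, here's the same kind of data written as JSON using only the standard library. The `settings` dictionary and its keys are invented for the example.

```python
import json

# Hypothetical settings record, used only for illustration.
settings = {"theme": "dark", "font_size": 12, "recent_files": ["notes.txt"]}

# indent and sort_keys make the output easy to read and to diff.
text = json.dumps(settings, indent=2, sort_keys=True)
print(text)

# A "quick script" edit: parse, change one value, write it back out.
edited = json.loads(text)
edited["font_size"] = 14
new_text = json.dumps(edited, indent=2, sort_keys=True)
print(new_text)
```

You can open the result in any editor, and anything that speaks JSON can read it, whether or not it's written in Python.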
Dill is a better Pickle, but it has all of the same problems that Pickle has: it’s probably not the right tool for the job.
- If you’re messaging, you probably want a well-defined binary messaging format, or a text-based format. (RabbitMQ does support pickle, though - here’s someone discovering that this is a problem and switching to JSON)
- If you’re storing lots of data, you probably want a database of one kind or another.
- If you’re storing little bits of data, you probably want a text-based format.
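For the "little bits of data" case, one possible approach (a sketch, not the only way to do it) is a small dataclass saved as JSON, with a loader that tolerates files written by older versions of the program. `Settings`, `dump_settings` and `load_settings` are hypothetical names made up for this example.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class Settings:
    """Hypothetical settings record; the field names are illustrative."""

    theme: str = "light"
    font_size: int = 12  # imagine this field was added in a later release


def dump_settings(settings: Settings) -> str:
    """Serialize the record as indented JSON text."""
    return json.dumps(asdict(settings), indent=2)


def load_settings(text: str) -> Settings:
    """Rebuild a Settings object from JSON text."""
    raw = json.loads(text)
    known = {f.name for f in fields(Settings)}
    # Ignore unknown keys and let the dataclass defaults fill in missing
    # ones, so files written by older or newer versions still load.
    return Settings(**{k: v for k, v in raw.items() if k in known})


# A file saved before font_size existed still loads cleanly.
old_file = '{"theme": "dark"}'
restored = load_settings(old_file)
print(restored)
```

Because the loader goes through the constructor, old data picks up new defaults instead of coming back as a half-initialized object, which is the failure mode you tend to hit when a pickled class changes underneath its data.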
But, all that being said, when you need to whack something together for a hackathon or over a weekend, pickle is a frigging godsend, and having the ability to save and load a session is pretty useful.