Q&A: What's Wrong with Dill / Pickle?


So, this is a Reddit response.

Automatic binary serialization (pickle, dill) is super convenient and a great way to make a quick proof of concept. I’ve used it more than my share of times to whack together some kind of serialization for a project I’m working on. But, just like real pickles, these solutions start to go really bad around the 2-year mark.

Dive Into Python 3 says, quite sensibly, that the only time it’s reasonable to use pickle is when “the data is only meant to be used by the same program that created it, never sent over a network, and never read by anything other than the program that created it.” The trick there, I think, is that no program is still “the same program that created it” 2 years later, unless you’ve stopped changing that program entirely. Classes get renamed, modules get moved, and the old pickles still point at the old names.
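
To make that concrete, here is a minimal sketch of how a harmless refactor can strand old pickles. The module name `myapp_models` and the class names `UserProfile` and `Account` are made up for illustration; the fake module just stands in for a real application's code at two points in time.

```python
import pickle
import sys
import types

# Hypothetical "version 1" source of an application module.
V1_SOURCE = """
class UserProfile:
    def __init__(self, name):
        self.name = name
"""

# Build a throwaway module so the example is self-contained.
models = types.ModuleType("myapp_models")
exec(V1_SOURCE, models.__dict__)
sys.modules["myapp_models"] = models

# Pickle an instance. The bytes record the class by module and name,
# not the class definition itself.
blob = pickle.dumps(models.UserProfile("ada"))


def can_load(data):
    """Return True if the pickle still loads against the current code."""
    try:
        pickle.loads(data)
        return True
    except (AttributeError, ImportError):
        return False


loadable_before = can_load(blob)

# "Two years later": someone renames the class during a refactor.
models.Account = models.UserProfile
del models.UserProfile

loadable_after = can_load(blob)
print(loadable_before, loadable_after)
```

Nothing about the data changed here, only the name of the class, and that is enough to make the old file unreadable without a compatibility shim.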

Text-based formats, by contrast, are:

  • Human readable.
  • Compressible.
  • Easy to pass between systems.
  • Easy to modify with quick scripts.
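
As a rough illustration of the "human readable" and "quick scripts" points, here is the same small dictionary as JSON and as a pickle. The `settings` data is invented for the example.

```python
import json
import pickle

# Hypothetical bit of application state.
settings = {"theme": "dark", "font_size": 12}

# JSON: plain text you can read, diff, and grep.
as_json = json.dumps(settings, indent=2, sort_keys=True)

# Pickle: opaque bytes that only make sense to Python.
as_pickle = pickle.dumps(settings)

# A "quick script" edit: a plain string replacement, the sort of thing
# you might otherwise do with sed or a text editor.
edited = as_json.replace('"font_size": 12', '"font_size": 14')
patched = json.loads(edited)

print(as_json)
print(as_pickle)
print(patched)
```

The JSON version can be fixed by hand in a text editor; the pickle version realistically can't.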

Dill is a better Pickle, but it has all of the same problems that Pickle has: it’s probably not the right tool for the job.

  • If you’re messaging, you probably want a well-defined binary messaging format, or a text-based format. (RabbitMQ does support pickle, though - here’s someone discovering that this is a problem and switching to JSON)
  • If you’re storing lots of data, you probably want a database of one kind or another.
  • If you’re storing little bits of data, you probably want a text-based format.
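
For the "little bits of data" case, a sketch of what the text-based route might look like: explicit dump and load functions with a version number, so old files keep loading after the code changes. The `Account` class, its fields, and the version numbers are all assumptions made up for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    email: str = ""  # hypothetical field added in schema version 2


def dump_account(account):
    """Serialize to JSON, tagging the record with its schema version."""
    return json.dumps(
        {"version": 2, "name": account.name, "email": account.email}
    )


def load_account(text):
    """Load any known schema version into the current Account class."""
    raw = json.loads(text)
    version = raw.get("version", 1)  # version 1 files had no version key
    if version == 1:
        return Account(name=raw["name"])
    return Account(name=raw["name"], email=raw["email"])


# A record written by the old code still loads...
old_record = '{"name": "ada"}'
from_old = load_account(old_record)

# ...and so does one written by the current code.
from_new = load_account(dump_account(Account("grace", "grace@example.com")))
```

It's more typing than `pickle.dump`, but the file format is now something you chose rather than a side effect of how your classes happened to look that day.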

But, all that being said, when you need to crack something together for a hackathon or over a weekend, pickle is a frigging godsend, and having the ability to save and load a session is pretty useful.
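
And to be fair to pickle, this is roughly what the weekend-hack version of "save my session" looks like, using only the standard library. The `session` contents and the file name are placeholders.

```python
import os
import pickle
import tempfile

# Hypothetical in-progress state you don't want to recompute.
session = {"step": 3, "results": [0.1, 0.2], "notes": "weekend hack"}

# Write it to a scratch file.
path = os.path.join(tempfile.mkdtemp(), "session.pkl")
with open(path, "wb") as f:
    pickle.dump(session, f)

# Later, in the same program, pick up where you left off.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Two calls, no schema, no thinking required. That's the appeal, as long as the file never has to outlive the weekend.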

Cube Drone