I ran into this a few months ago and figured (hoped) it was a one-off thing I’d never see again. But, I just encountered it again.
I had gotten a database dump from a largeish WordPress site, using phpMyAdmin’s export option. (Don’t remember what version of phpMyAdmin). The dump was about 150MB which was in the realm of what I expected. But the file seemed corrupt:
# file dump.sql.gz
dump.sql.gz: data
When I inspected the file more closely, I saw it had plain SQL in it, as though phpMyAdmin had ignored my compression request. So I tried just loading it into the database as-is. No joy.
I tried exporting again and got the same results. I couldn’t export uncompressed from that install because of some problem with the host (I don’t remember exactly; maybe a timeout?).
Looking at where the sql load was failing, I saw that after many kb of clear text SQL, the rest of the file was binary data. I eventually cut out the clear text at the beginning of the file, and what I was left with was good gzip data:
# file newdump.sql.gz
newdump.sql.gz: gzip compressed data, from Unix
I used “dd” to figure out where the boundary was between the text and binary data, and then to extract just the binary data. Like so:
# dd if=dump.sql.gz bs=74660 skip=1 of=newdump.sql.gz
I found the number 74660 through trial and error, running that command repeatedly and inspecting the output each time.
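(In hindsight, the trial and error could be skipped: a gzip stream starts with the magic bytes 0x1f 0x8b, followed by 0x08 for the deflate method, so you can search the file for them directly. A sketch in Python — `find_gzip_offset` is my name, not anything from the tools above, and in theory the plain-text portion could contain those bytes by coincidence, so the result is worth sanity-checking with `gunzip -t`:)

```python
def find_gzip_offset(path):
    """Return the byte offset of the first embedded gzip stream.

    Searches for the gzip magic bytes (0x1f 0x8b) followed by the
    deflate compression-method byte (0x08). Raises if none is found.
    """
    with open(path, "rb") as f:
        data = f.read()
    offset = data.find(b"\x1f\x8b\x08")
    if offset < 0:
        raise ValueError("no gzip stream found in %s" % path)
    return offset
```

(The number it returns is exactly what you’d feed to dd as the block size with skip=1, as above.)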
I uncompressed newdump.sql.gz and saw that it started right where the clear text had left off. The last character of the clear text was a closing paren of a multi-line insert, and the first character of the uncompressed newdump.sql.gz was the comma that would come next. So I used dd to save off the clear text, then catted them together to load them in the database. Here’s the whole sequence:
# dd if=dump.sql.gz bs=74660 skip=1 of=newdump.sql.gz
# gunzip newdump.sql.gz
# dd if=dump.sql.gz bs=74660 count=1 of=head.sql
# cat head.sql newdump.sql | mysql -p dbname
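(The same four-step sequence can be done in one go without dd, by splitting at the gzip magic bytes and decompressing the tail in memory. Another sketch, with a made-up name — this writes a repaired plain-SQL file you’d then pipe into mysql:)

```python
import gzip

def repair_dump(mixed_path, out_path):
    """Split a half-text, half-gzip dump at the gzip boundary and
    write the fully decompressed SQL to out_path."""
    with open(mixed_path, "rb") as f:
        data = f.read()
    # Gzip magic bytes plus the deflate method byte mark the boundary.
    boundary = data.find(b"\x1f\x8b\x08")
    if boundary < 0:
        raise ValueError("no gzip stream found")
    head = data[:boundary]                    # the clear-text SQL prefix
    tail = gzip.decompress(data[boundary:])   # the compressed remainder
    with open(out_path, "wb") as f:
        f.write(head + tail)
```

(Then `mysql -p dbname < repaired.sql`, same as catting the two pieces together.)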
Like I said, I thought this was a one-off, so I was surprised to be given a new 190MB compressed SQL file with the same symptoms. This time someone else did the dump from phpMyAdmin, so again I don’t know what version. The symptoms and the solution were the same: I used dd to extract the binary portion, then uncompressed it and fed it into mysql.
If I run across this again (surely not?) I’ll track down the versions of things. I’m a little curious to see the bug that results in 74,660 bytes of uncompressed data being output, followed by the rest of the input being compressed. ‘Til next time,
Lars