Geoff Garbers

Husband. Programmer. Tinkerer.

Extracting images from a South African MMS

Apr 16, 2010

Recently, one of the projects I worked on required that users of the site be able to submit articles, along with images linked to those articles.

These images were optional and, because this was a mobile site, we needed to keep the process as phone-agnostic as possible. In the end, we decided to give users an email address to which they could MMS their images. But what a process it turned out to be to extract those images!

South African Mobile Network Operators (MNOs) seem hell-bent on making sure that the end user knows exactly which MNO was used to send any given MMS. I only realised this when trying to extract a single user-sent image from an MMS.

It turns out that South African MNOs attach all kinds of promotional material to the MMS messages sent through their networks. At one point, Vodacom was sending through two different promotional images, in addition to the user’s message.

How do I extract the required image?

Well, at first, it was a lot of hit-and-miss, especially because the networks’ images are all sent with different headers. Sometimes they use the same headers, but format them differently. Before I continue, I must mention that I am using the Mail_mimeDecode class available in PHP’s PEAR library.

I was going to try to describe how I go through the message, but I’d rather let the code do the talking:

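require_once 'Mail/mimeDecode.php';

// $mailData holds the raw email (headers and body) as a string.
$decoder = new Mail_mimeDecode($mailData);
$mail = $decoder->decode(array('include_bodies' => true, 'decode_bodies' => true, 'decode_headers' => true));

// A non-multipart message has no parts, so fall back to an empty list.
$parts = isset($mail->parts) ? $mail->parts : array();
foreach($parts as $part) {
	// Only image parts are of interest.
	if(!isset($part->ctype_primary) || strtolower($part->ctype_primary) != 'image') {
		continue;
	}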

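	// Only accept base64-encoded parts. The encoding name is case-insensitive in MIME.
	if(!isset($part->headers['content-transfer-encoding']) || strtolower(trim($part->headers['content-transfer-encoding'])) != 'base64') {
		continue;
	}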

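	// The part must either carry a name= parameter in its content type, or have a content-disposition header.
	$valid = (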
		(isset($part->headers['content-type']) && stripos($part->headers['content-type'], 'name=') !== false)
		|| isset($part->headers['content-disposition'])
	);

	if(!$valid) {
		continue;
	}

	// Strip any parameters (such as "; name=...") and normalise case to get the bare MIME type.
	$mimeArr = explode(';', $part->headers['content-type']);
	$mime = strtolower(trim($mimeArr[0]));

	switch($mime) {
		case 'image/jpeg':
			$ext = 'jpg';
			break;
		case 'image/gif':
			$ext = 'gif';
			break;
		case 'image/png':
			$ext = 'png';
			break;
		default:
			$ext = null;
			break;
	}

	if($ext === null) {
		continue;
	}

	// $part->body holds the decoded image data, and $ext its file extension.
	// continue with rest of code execution...
}

Keep in mind that this only caters for the three major MNOs in South Africa, namely Vodacom, MTN and Cell C. Virgin is not large enough to warrant catering for it specifically.

Hopefully, I’ve laid out the code above clearly enough that it doesn’t need further explanation here. However, it recently came to my attention that this script is not foolproof, as some of the MNOs’ logos have managed to creep through.

The next method I intend to attempt is checking the dimensions of each image. From what I’ve seen, there is a fixed set of image sizes that could possibly come from the MNO. So, I could exclude certain images based on their dimensions. This is not as foolproof as I’d like, though.
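As a rough sketch of that idea, assuming PHP 5.4 or later (for getimagesizefromstring()): the function names dimensionKey() and shouldSkipBySize(), and the sizes in $operatorSizes, are hypothetical placeholders of mine, not values measured from real messages.

```php
<?php
// Placeholder sizes only. The real operator dimensions would need to be
// collected from actual messages.
$operatorSizes = array('60x60', '400x50');

/**
 * Returns "WIDTHxHEIGHT" for raw image bytes, or null if PHP cannot
 * read the data as an image.
 */
function dimensionKey($imageData)
{
    // Suppress the notice PHP may raise for unreadable data.
    $info = @getimagesizefromstring($imageData);
    return $info === false ? null : $info[0] . 'x' . $info[1];
}

/**
 * True when the image should be dropped: it is unreadable, or its
 * dimensions are in the list of operator sizes.
 */
function shouldSkipBySize($imageData, array $operatorSizes)
{
    $key = dimensionKey($imageData);
    return $key === null || in_array($key, $operatorSizes, true);
}
```

Inside the loop above, calling shouldSkipBySize($part->body, $operatorSizes) and continuing when it returns true would drop those parts. An exact-match list like this only helps for as long as the operators keep their image sizes unchanged.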

A final suggestion is that we check for minimum image sizes. Again, this is not foolproof, but it’s another step towards ensuring we’re only getting the required images. From what I’ve seen (and this is primarily from looking at Vodacom’s messages), there can be either a banner advertisement or a small Vodacom logo.

So, we could exclude images that are 60x60 or smaller, or that are 50 pixels high with a width between 300 and 500 pixels, i.e. (WIDTH > 300 && WIDTH < 500)x50. (These are merely examples; actual image sizes would be different.)
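Written out as code, that rule might look something like the sketch below. The function name looksLikeOperatorImage() is hypothetical, and the thresholds are just the example values from above, not real operator sizes.

```php
<?php
/**
 * Returns true when the dimensions match one of the two example rules:
 * a small logo (60x60 or smaller), or a banner that is exactly 50 pixels
 * high and more than 300 but less than 500 pixels wide.
 */
function looksLikeOperatorImage($width, $height)
{
    $isSmallLogo = ($width <= 60 && $height <= 60);
    $isBanner = ($width > 300 && $width < 500 && $height == 50);

    return $isSmallLogo || $isBanner;
}
```

The width and height could come from getimagesizefromstring($part->body) on PHP 5.4 or later. Note that a genuinely small user image, such as a 48x48 thumbnail, would be thrown away by the first rule as well.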

This is the method I use. Anyone else have any other methods?