My friends and I wanted to see if we could break Facebook chat by continuously messaging each other on the same thread. Ten months later, the message count is at roughly 60,000 and Facebook has held up pretty well.

One of our friends has a birthday around the corner, so we decided to download the entire chatlog and make a memory book out of it. Halfway through the project, it started to feel more and more like a yearbook. It really has become a book of my group's memories over the past year or so, and it's something that I would want to pick up once in a while to reminisce over. This post documents the project and is meant as a tutorial in case some of you out there want to do something like this for yourselves! (Some web programming knowledge required.)

Downloading the chat

First things first, we need to download the entire chatlog. Luckily, Facebook lets us do this directly. Click on the gear to reveal the drop-down menu. Click on “Account Settings” and then “Download a copy of your Facebook data”. This includes your profile pictures, the photos and videos you’ve uploaded, and every single message that you’ve sent or received.

FB download instructions

Give FB some time to compile everything. When your data is ready to be downloaded, FB will send you an email with a download link. After you have downloaded the zipped file, unzip it and open index.html. Click on “Messages” on the left-hand menu. You’ll see something like this:

downloaded FB msgs

This is really, really bland. There is minimal formatting, and notice that the profile icons that usually show up in the FB chat are not there. This chat page also has EVERY SINGLE MESSAGE that you’ve ever sent or received. Additionally, the pictures embedded in the chat are not downloaded. In fact, if we examine the HTML code, we find that NOTHING associated with these pictures was downloaded. Bummer, we’ll have to do some manual labor to retrieve these pictures later.

Formatting the chat

First of all, we need to isolate a specific message log. Right-click somewhere in your browser and select “Inspect Element” to reveal the editor at the bottom. Select the appropriate “thread” div, and it should highlight the content of the thread we’re looking for.

selecting the appropriate thread
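If clicking through the inspector gets tedious, the same isolation could probably be scripted. Here is a minimal Python sketch (not what I actually did). It assumes each conversation is wrapped in a tag written exactly as `<div class="thread">`, and that you know some phrase that only appears in the thread you want; the helper names `extract_div` and `find_thread` are made up for illustration.

```python
import re

# Matches opening and closing div tags so nesting depth can be tracked.
DIV_TAG = re.compile(r'<(/?)div\b[^>]*>')

def extract_div(html, start):
    """Return the div that opens at index `start`, through its matching </div>."""
    depth = 0
    for match in DIV_TAG.finditer(html, start):
        depth += -1 if match.group(1) else 1
        if depth == 0:
            return html[start:match.end()]
    raise ValueError("unbalanced <div> tags")

def find_thread(html, needle):
    """Return the first thread div containing `needle`, or None."""
    # Assumption: threads are marked up exactly as <div class="thread">.
    for match in re.finditer(r'<div class="thread">', html):
        block = extract_div(html, match.start())
        if needle in block:
            return block
    return None

sample = (
    '<div class="thread">A<div class="message">x</div><p>hi</p></div>'
    '<div class="thread">B<div class="message">y</div><p>secret</p></div>'
)
print(find_thread(sample, "secret"))
```

The result can be pasted into a new HTML file the same way as the copy from the inspector. On a very large export this naive scan may be slow, so treat it as a starting point.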

Copy and paste this into a new HTML file (I prefer to use Notepad++), add the appropriate header section, and you should have an isolated chatlog. Now, make your own CSS file to format the chat. This shouldn’t be too difficult. Unfortunately, CSS alone is inadequate because I want the profile icons to show up next to the chat. I manually downloaded the profile icons of everyone in the thread and wrote a bit of JavaScript to insert each picture into the HTML. The script can take a few minutes to scan over the 300,000+ lines of HTML and modify each line, so I suggest testing any code you write on only the first 20 messages or so. It looks much better now.

Formatted chat after some CSS and JavaScript
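For anyone who would rather do the icon step offline instead of in the browser, here is a rough Python sketch of the same idea. It is not my original JavaScript. It assumes each sender name appears in the export as `<span class="user">Name</span>`, and the names, file paths, and the `icon` class below are all hypothetical.

```python
import re
from html import escape

# Hypothetical mapping from display name to a locally saved icon file.
ICONS = {
    "Alice Example": "icons/alice.jpg",
    "Bob Example": "icons/bob.jpg",
}

# Assumed markup: the sender name sits in <span class="user">...</span>.
USER_SPAN = re.compile(r'<span class="user">([^<]*)</span>')

def add_icons(html, icons):
    """Put an <img> tag in front of every sender name that has an icon."""
    def replace(match):
        src = icons.get(match.group(1))
        if src is None:
            return match.group(0)  # unknown sender: leave the span untouched
        img = '<img class="icon" src="%s" alt="">' % escape(src, quote=True)
        return img + match.group(0)
    return USER_SPAN.sub(replace, html)

sample = (
    '<div class="message"><span class="user">Alice Example</span>'
    '<span class="meta">Monday</span></div><p>lol</p>'
)
print(add_icons(sample, ICONS))
```

Running it on a small sample like this first is the same "test on 20 messages" advice from above. The CSS file can then position `img.icon` next to each message.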

Downloading photos from FB

The formatting is done, but now I want to spice things up a bit. I thought it would be a good idea to take random photos of us and insert them into the chat, between messages at the appropriate date. To download all the photos that everyone has uploaded onto FB, I chose to use PhotoGrabber for its simplicity and ease of use. Now it’s up to me to manually insert some photos.

Volume Reduction and Printing

Unfortunately, if I printed out everything as is, it would take more than 2000 pages. The HTML provided by FB formats the chat so that each time a user hits ‘submit’, there is a new entry in the log. One-liners like ‘lol’ followed by another single ‘yeah :D’ take up a lot of room. The sheer amount of code often lags even Notepad and crashes Notepad++. Thinking back on it, I should have split up the file into sections of 50,000 lines each instead of working on the entire thing at once. I manually went through as much content as I could to combine one-liners (it might be a better idea to write code for this, but my JavaScript skills aren’t good enough). Even after reducing the space taken up by the name and date of each entry, the file was still way too big. I thought of formatting the chat to fit multiple columns per page, and this turned out to be tricky enough to be worth going into in a bit of detail.

The main problems I had to figure out were how to:
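Combining one-liners is the kind of thing a short script could handle. Here is a minimal Python sketch, assuming the chat has already been parsed into a list of (sender, text) pairs in order. The parsing itself is not shown, and the `max_len` cutoff for what counts as a one-liner is a number I made up. It only joins a short message onto the previous entry when both come from the same sender.

```python
def merge_one_liners(messages, max_len=40, sep=" "):
    """Join short messages onto the previous entry from the same sender.

    `messages` is a list of (sender, text) tuples in chat order.
    `max_len` is a hypothetical cutoff for what counts as a one-liner.
    """
    merged = []
    for sender, text in messages:
        same_sender = bool(merged) and merged[-1][0] == sender
        if same_sender and len(text) <= max_len:
            # Append to the previous entry instead of starting a new one.
            merged[-1] = (sender, merged[-1][1] + sep + text)
        else:
            merged.append((sender, text))
    return merged

chat = [
    ("Alice", "lol"),
    ("Alice", "yeah :D"),
    ("Bob", "ok"),
    ("Alice", "brb"),
]
print(merge_one_liners(chat))
```

In this example the two consecutive messages from Alice collapse into one entry, while Bob's reply and Alice's later message stay separate. The merged entry keeps only one name and date header, which is where the space saving comes from.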

  1. adjust page margin,
  2. print multiple pages per sheet, and
  3. not cut conversations and pictures in half, i.e. insert page breaks at appropriate locations

Firefox’s print preview lets the user adjust the page size (under print -> properties -> advanced) and the number of pages per sheet, but not the margins. It also can’t figure out where to put page breaks.

Google Chrome’s print preview (I’m using Chrome version 26.0.1410.43) lets the user adjust the page margins, but not the page size. If we click on “Print using system dialog” (or press Ctrl+Shift+P), we can change the page size, but not the margins. Luckily, Chrome does insert page breaks intelligently, so we can work with this. In Google Chrome, the default print page size is the page size of your default printer. Therefore, we need to go to “Devices and Printers” in the Control Panel, right-click on the default printer, and click on “Printing Preferences”. Click on “Advanced” and then “Edit Custom Page Size”.

It turns out that 3 columns per page gives us the right print size, so I made each page 3.5″ x 9.5″ (will explain in a bit). I also adjusted the CSS file so the entire chat is 3.5″ wide. Now, when this is printed to a PDF file, every page is filled with a single column of our chat log, with no white space or margins. This is exactly what I am looking for. During this step, I separated the entire log into two halves and printed out 360 pages at a time, because Chrome crashed whenever I tried to print too many pages at once. This resulted in 6 different 360-page PDF files.

Now I have many pages of 3.5″ x 9.5″, but I want the end result to be 3 columns per page on an 8.5″ x 11″ sheet. I print the PDF again with Adobe Reader XI. Adobe Reader’s print preview does not let me adjust the margins (I believe this is a bug; Googling did not turn up any solutions), so I went with the default (which is 0 margins). The advantage of printing with Adobe Reader is that it lets me select any number of pages to fit onto a single sheet. Selecting 3 x 1 on a single sheet of Letter size paper generates 3 columns evenly spaced across the page. This spacing is a result of making each page I’m printing from 3.5″ x 9.5″, because Adobe Reader shrinks the pages I’m printing to fit onto the sheet. If I had made the page size 3.5″ x 8.5″, there would be no spacing or margins. It’s important that there are margins on either side of the page because we need to save space for binding the pages together. After printing to PDF, I get 6 PDF files of 120 pages each, with 3 columns of chat per page.
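As a quick sanity check on the page counts, here is the arithmetic as a tiny Python sketch. It assumes every one of the 6 single-column PDFs has exactly 360 pages and that the final print is single-sided; the variable names are just for illustration.

```python
COLUMNS_PER_SHEET = 3   # 3 x 1 layout in Adobe Reader
PAGES_PER_FILE = 360    # single-column pages per PDF from Chrome
FILES = 6

# Each Letter sheet holds 3 single-column pages.
sheets_per_file, leftover = divmod(PAGES_PER_FILE, COLUMNS_PER_SHEET)
total_columns = PAGES_PER_FILE * FILES
total_sheets = sheets_per_file * FILES

print(sheets_per_file, leftover)    # 120 0
print(total_columns, total_sheets)  # 2160 720
```

So 360 single-column pages divide evenly into 120 three-column pages per file, which matches the 6 files of 120 pages each.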

Change the page size back to 8.5″ x 11″, since the default is still set to 3.5″ x 9.5″.

I’m almost done! All that’s left is to print these PDFs onto physical paper and bind them together! Ashley did a great job of creating a cover for this project, and I printed that out along with the PDFs. Here’s the end result: