Step 1 of this journey is to get the data to “train” your neural network into a text file with 1 entry per line. Sounds simple enough, right?
I’m going to try to train my network using posts from Facebook cat picture sharing groups. For my first (and hopefully only) attempt I’m using the Data Miner Chrome extension on the Facebook group This Cat is C H O N K Y, since the users there seem to have a firm grasp on this “language”. Ideally I’ll be able to pull the # of likes/shares each post got so I can try to teach my network which words are the most successful.
I installed Data Miner, and thanks to past experience with HTML and scripting tools I picked it up quickly enough. I made a “recipe” that clicks through a FB photo album and scrapes the caption, # of likes and # of comments from each picture.
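For anyone following along, here’s a rough sketch of how scraped rows could be boiled down to that one-entry-per-line text file. It’s in Python, and the field names (`caption`, `likes`, `comments`), the sample rows and the helper functions are all made up for illustration. This is not Data Miner’s actual export format, just one way the cleanup might look.

```python
def to_training_lines(posts):
    """Return one cleaned caption per post, skipping empty captions."""
    lines = []
    for post in posts:
        # Collapse newlines and repeated spaces so each caption fits on one line.
        caption = " ".join(post["caption"].split())
        if caption:
            lines.append(caption)
    return lines


def write_training_file(posts, path):
    """Write one caption per line to a UTF-8 text file; return the line count."""
    lines = to_training_lines(posts)
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
    return len(lines)


# Hypothetical sample rows, standing in for whatever the scraper exports.
sample_posts = [
    {"caption": "He is a\nmegachonker", "likes": 120, "comments": 14},
    {"caption": "   ", "likes": 3, "comments": 0},
    {"caption": "oh lawd he comin", "likes": 950, "comments": 87},
]

training_lines = to_training_lines(sample_posts)
```

Calling `write_training_file(sample_posts, "captions.txt")` would then save the cleaned captions to disk (the file name is just a placeholder). The likes and comments aren’t used yet; they’re kept in the rows in case they turn out to be useful for weighting later.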
Unfortunately, it requires a delay of several seconds between photos to allow time for the data to load. Ideally I’d like to mine thousands of posts to compile data, and with a 5 second delay between photos, even just 1,000 posts would take over 83 minutes to scrape!
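A quick way to sanity-check scrape times for different post counts and delays is a few lines of Python. The function name here is my own, and it only counts the fixed delay between photos, so real runs would take somewhat longer once page loads are factored in.

```python
def scrape_time_seconds(num_posts, delay_seconds):
    """Total waiting time if every post costs one fixed delay."""
    return num_posts * delay_seconds


seconds = scrape_time_seconds(1000, 5)
minutes = seconds / 60
hours_for_10k = scrape_time_seconds(10_000, 5) / 3600

print(f"1,000 posts: {seconds} seconds, or {minutes:.1f} minutes")
print(f"10,000 posts: {hours_for_10k:.1f} hours")
```

So 1,000 posts is a long coffee break, but 10,000 posts means leaving the browser running for most of a day.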
After going back to the drawing board (aka Google), I discovered that others use R to extract data from Facebook. I found a fairly recent article, “Step by Step Guide: Extract Data from Facebook”, so I figured it’s worth a shot.
But it’ll have to be continued in the next post since this one is already pretty long. Here are some cat pictures in the meantime!