
Neural Networking (for Dummies) 3 – R Data Mining from Facebook

To recap, my mission so far is to train a neural network to write better cat picture captions, based on which ones are successful in Facebook cat picture groups.

To do this I will need to mine data to teach my network, and my journey has brought me to this article: Facebook Data Mining using R. I don’t want to just copy and paste what this other person put together (it is a phenomenal guide), but I’ll provide commentary on how well these instructions work for an R novice like myself.

To get started, I had to create a Facebook app by registering as a developer. I’ve already done this for a side business, so I just navigated to the developer site.

Once there, I made a new app named “Research Data” and copied down its App ID and Secret Key. It wouldn’t let me do the 4th step, adding “http://localhost:1410/” to the “Valid OAuth redirect URIs” box, but I am hopeful that this will not cause complications down the road.

Basically it’s me rn, only instead of Mariah Carey’s holiday jingle it’s an inevitable amount of frustration from troubleshooting technical issues I am in no way qualified to troubleshoot

After following the steps to create a Facebook app, I went over to R-Project to download R, which honestly turned into a very confusing situation, since there is a complete lack of pictures and way too many links to click on. I consulted their Windows FAQ and stumbled over to the Michigan Tech University mirror to download it. I left all the defaults checked and everything installed without a problem.

This is going eerily well….

I started up RGui (64 bit) and installed the packages and loaded the libraries per the instructions. Again, it completed with no issue. I am terrified of what step will finally go wrong (something ALWAYS goes wrong…)

AGH! Finally got stuck… The instructions had me run a line of code but didn’t explain exactly what to do.

So….where is this “Site URL”?

I ended up having to copy and paste “http://localhost:1410/” into the Facebook Login part of the FB App. To get there I went to Quickstart, clicked on “WWW” (website) as my type, and then entered the site URL and hit “save”.

After that I hit “enter” on my R script, and an error flashed up on Facebook; it looked like something had gone wrong. I got lucky and found a forum where a user shared that changing TRUE to FALSE in that line of code fixed the issue. I reran the line and R authenticated successfully!

On the next step I got tripped up because I thought two lines of code were actually one. I made the mistake because, in the step before, what looked like two lines really was a single wrapped line of code. But at least everything seems to be working now.
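For reference, the authentication steps above look roughly like the sketch below. It is not copied from the guide: the App ID and Secret Key are placeholders, it assumes the Rfacebook package is already installed, and it assumes the TRUE/FALSE switch is the extended_permissions argument of fbOAuth() and that the "two lines" are the save and load calls.

```r
library(Rfacebook)

# Placeholders: substitute the App ID and Secret Key copied from the developer site.
app_id     <- "YOUR_APP_ID"
app_secret <- "YOUR_APP_SECRET"

# Opens a browser window for the OAuth handshake via http://localhost:1410/.
# Assumption: this is the TRUE -> FALSE change; FALSE requests public information only.
fb_oauth <- fbOAuth(app_id = app_id,
                    app_secret = app_secret,
                    extended_permissions = FALSE)

# Two separate statements, one per line:
save(fb_oauth, file = "fb_oauth")  # write the token to a file named "fb_oauth"
load("fb_oauth")                   # read it back in a later session
```

Saving the token means the browser handshake does not have to be repeated every time R is restarted.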

Two lines, not one 🤦

As I went on to try the other listed examples of information to pull, I quickly ran into errors. After digging around in the Rfacebook user guide, I discovered that authenticating with “FALSE” withholds access to the user’s personal data and only grants access to public information.

Further down, that same section gives instructions on how to grant additional access to your application through the “Graph API Explorer”. In the popup, just check the boxes for the additional permissions you want the app to have when you control it from R.
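One way to use those extra permissions from R is sketched below. This may not be exactly what was done here: it relies on the fact that Rfacebook functions accept a temporary access token string from the Graph API Explorer in place of an fbOAuth() token, and the token value is a placeholder.

```r
library(Rfacebook)

# Placeholder: paste the temporary access token shown in the Graph API Explorer
# after ticking the permission boxes. These tokens expire after a short time.
token <- "PASTE_TEMPORARY_TOKEN_HERE"

# Quick check that the token works: look up the authenticated user.
me <- getUsers("me", token = token)
me$name
```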

After adding those additional permissions, the remainder of the examples given in the article worked like a charm!

I completely forgot about “Ryan’s Beard”

So now that I’m able to use Rfacebook, it’s time to read through its documentation and find a way to pull the information I need. From what I’ve seen, I’ll need to know the group ID to pull post information from a group. After more time spent sifting around the internet, I finally came across a user with the answer: view the page source of the Facebook group and search for “profile_id=” in the code.
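The “profile_id=” search can also be done in R instead of by eye. The helper below is a hypothetical sketch (extract_profile_id is a made-up name, not part of Rfacebook), and it assumes the page source has been saved to a local file and that the ID is the run of digits right after “profile_id=”.

```r
# Hypothetical helper: return the digits that follow the first "profile_id="
# in a page's source, or NA if the marker is not found.
extract_profile_id <- function(page_source) {
  # Accept either one long string or a vector of lines (e.g. from readLines()).
  page_source <- paste(page_source, collapse = "\n")
  hit <- regmatches(page_source, regexpr("profile_id=[0-9]+", page_source))
  if (length(hit) == 0) {
    return(NA_character_)
  }
  sub("profile_id=", "", hit, fixed = TRUE)
}

# Usage sketch (file name and token are assumptions):
# group_id <- extract_profile_id(readLines("group_page.html", warn = FALSE))
# posts    <- getGroup(group_id, token = fb_oauth, n = 100)  # Rfacebook::getGroup
# head(posts)
```

Keeping the ID as a character string avoids R turning a long number into scientific notation before it reaches getGroup().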

I feel like I’m in the Matrix (dear God, did I really just make a 20+ year old joke?!)

And after ALL THAT WORK, I finally received an error that has put a stop to this route: In order to pull public data, the application will need to be reviewed by Facebook to ensure the data is not being used in a way that is against their policies.

Are you kidding me?

I work at a corporate job, so I live the life of red tape and policies like this every day, and I have 0 interest in having that bleed over into my free time. So I’ll have to go back to the drawing board AGAIN to see if there is another place or a better avenue to pull cat caption data from a social media site.

Yeah I’m not doing that

At least I got to use R for the first time–maybe that’ll be useful later on. And for now I’ll just spend some time with my sweet kitties and lady before I start all over again.
