After over a month of software and hardware issues, I am finally at the point where I can install textgenrnn and train this network. Last time I recapped the long and aggravating (though educational) road I’ve been on, so I won’t bother with it this time. Onward!
I opened up my terminal and typed in pip3 install textgenrnn and it seemed to install just fine. Now that I have that installed, I’m going to resume trying to train this network and hope for the best!
As a refresher, I failed miserably to mine data from Facebook, since originally I wanted to train a network to make cat picture captions. For reasons.
Since that was a dead end thanks to Facebook’s data mining policy, I decided to use some dialogue from my favorite character David Rose on the show Schitt’s Creek. I ended up finding some scripts online and used the lines that were identified for his character to just see what I get.
First, I made a Notepad file named “input” and copied and pasted in my David Rose lines from the TV scripts.
I then made another text file and copied and pasted in the following from the article:
from textgenrnn import textgenrnn
t = textgenrnn()
To run the script, I opened up the terminal and dragged the script over to automatically fill out the details. Unfortunately, I ran into a permission denied error.
To get around this, I opened the terminal from the location of the file and specified to run it with python. Unfortunately, I then ran into an old familiar error: “ModuleNotFoundError: No module named ‘textgenrnn’”.
Next I went back and decided to just try making the python script executable and maybe the problem would take care of itself.
I ran into a new problem about “/var/mail/textgenrnn” not being able to be read and it came back to apparently the script not knowing what python environment to use.
I went back to my script and added “#! /usr/bin/python3” as the first line of my script and then reran the script... and nothing seemed to happen.
But since this supposedly takes 10-15 minutes to run, I wasn’t worried yet.
After a few minutes an error popped up so I reopened the terminal and tried to rerun. This time I got an error: /usr/bin/env: ‘python3\r’ No such file or directory.
After more Googling I found out that if you make a script in Windows and then use it in Linux, it will have issues with the line ending characters. So I went back to my script, copied and pasted from the article again, saved, and reran (and left off that first line just in case). Again, we’re back to the “can’t read /var/mail/textgenrnn” error.
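For anyone hitting the same ‘python3\r’ error, here is a minimal sketch of one way to strip Windows line endings from a script using Python itself instead of re-pasting. The function name normalize_line_endings is made up for this example and is not part of any library.

```python
from pathlib import Path


def normalize_line_endings(script_path):
    """Rewrite a file in place with Unix (LF) line endings.

    Returns the number of Windows (CRLF) line endings that were replaced.
    """
    path = Path(script_path)
    data = path.read_bytes()  # raw bytes, so Python does not translate newlines for us
    crlf_count = data.count(b"\r\n")
    if crlf_count:
        path.write_bytes(data.replace(b"\r\n", b"\n"))
    return crlf_count
```

Calling normalize_line_endings on the script once and then rerunning it should be enough; a second call returning 0 means there was nothing left to fix.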
I decided that I would just find where textgenrnn is in my system and make a brand new script just to get rid of any of the Windows formatting that may have still existed. I then ran into the permission denied error, so I ran the chmod u+x command.
I once again received the /var/mail/textgenrnn error and this time added #!/usr/bin/env python3.6 to the top of my script.
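A note on why the /var/mail error shows up at all: without a shebang line, the shell tries to run the file as a shell script, and "from" happens to be a real mail-checking command, so "from textgenrnn import textgenrnn" sends it looking for a mailbox at /var/mail/textgenrnn. Below is a small sketch of a check for the first line of a script. The helper name shebang_problems and its messages are invented for this example.

```python
def shebang_problems(script_path):
    """Return a list of problems found on the first line of a script."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline()

    problems = []
    if not first_line.startswith(b"#!"):
        # The shell will interpret the file itself instead of handing it to Python.
        problems.append("no shebang line")
    elif not first_line[2:].strip().startswith(b"/"):
        # The interpreter after "#!" has to be an absolute path such as /usr/bin/env.
        problems.append("interpreter path is not absolute")
    return problems
```

An empty list means the first line looks sane; it does not check that the interpreter actually exists on the machine.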
This time I made progress! I ran into a new error: ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory. Other users had this issue with incompatible versions of tensorflow and CUDA, so that is where I will start.
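A quick way to see whether the library from that error is visible to the system, without waiting on a full tensorflow import, is to ask the dynamic loader directly. This is only a sketch: can_load is a made-up helper name, and the only library name it checks is the one from the error message above.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the file is missing or not on the loader's search path.
        return False


# Library name copied from the ImportError above.
print("libcublas.so.10.0", "found" if can_load("libcublas.so.10.0") else "missing")
```

If this prints "missing", the problem is with the CUDA install or the library path rather than with textgenrnn.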
I checked my version of CUDA by running the command nvcc --version (I had to install the CUDA tools first by running sudo apt install nvidia-cuda-toolkit) and discovered that because I followed the other guide, I had only installed CUDA 9 instead of the CUDA 10 I need.
Apparently I can install CUDA 10.0 along with other versions of CUDA, so I’ll follow this guide How To Install CUDA 10 (together with 9.2) on Ubuntu 18.04 with support for NVIDIA 20XX Turing GPUs because it seems like this could get tricky very fast. And if I end up making too much of a mess and having to wipe everything and start over, I’ll follow his other guide The Best Way To Install Ubuntu 18.04 with NVIDIA Drivers and any Desktop Flavor.
For ease of reading, I’m going back to bullet style from here:
- Installed CUDA dependencies:
sudo apt-get install build-essential dkms
sudo apt-get install freeglut3 freeglut3-dev libxi-dev libxmu-dev
- Downloaded CUDA 10.0 with the following options:
- Installed the CUDA software by running sudo sh cuda_10.0.130_410.48_linux.run
Unfortunately it failed to install, so I think I need to uninstall the old instance of CUDA first. It seems I’m not the only one who was trying to uninstall CUDA 9 and upgrade to 10 on Ubuntu 18.04, so I tried the advice found on that forum but nothing seemed to work.
I decided to just go back to the original instructions and run the installation from the network like the guy suggested.
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
I noticed that an error occurred: Sub-process /usr/bin/dpkg returned an error code (1), and fixed it by running the command sudo dpkg -i --force-all:
sudo dpkg --configure -a
sudo dpkg -i --force-all /var/cache/apt/archives/libcublas-dev_10.1.0.105-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
- Next I restarted my computer and checked to make sure the new NVIDIA driver was active
- The instructions told me to check my installation with the command
ls -l /usr/local/
but it shows I have installed CUDA 10.1 instead of 10.0 and I KNOW that is going to cause a problem 🙁
Eventually I went back to my downloads and just installed CUDA 10.0 by running sudo sh cuda_10.0.130_410.48_linux.run from my downloads folder. This time I did not have it install the 410.48 NVIDIA driver and just had it install the library and toolkit.
To update these path variables (I think?) I typed in the following to the terminal:
export PATH=$PATH:/usr/local/cuda-10.0/bin
export CUDADIR=/usr/local/cuda-10.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
Using the command nvidia-smi shows that I am still using CUDA 10.1 with the NVIDIA 418.40 driver, but I figure I’ll just run my script again and see if I’m still getting the same missing library cuda 10 error.
And now I’m getting a new error! Progress!
I am 99.9% sure this is because I did not install the CUDNN files to my CUDA 10.0 instance, so I’m going to do that and see if that gets me any closer. First I downloaded cuDNN (I chose the runtime library) and then I needed to follow these install instructions.
While following the instructions, I realized I didn’t update my system paths properly. After reviewing my last post, I opened a fresh terminal and accessed the bashrc file by entering:
. ~/.bashrc
nano ~/.bashrc
I scrolled to the bottom of the file where I originally edited it and changed the path and library locations to refer to CUDA 10.0.
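To double-check that a fresh terminal actually picked up the edited paths, something like the sketch below can list which entries of PATH and LD_LIBRARY_PATH mention CUDA. The helper cuda_entries is a made-up name for this example, and it assumes the colon-separated format Linux uses.

```python
import os


def cuda_entries(path_value, sep=":"):
    """Return the entries of a separator-delimited path string that mention CUDA."""
    return [entry for entry in path_value.split(sep) if "cuda" in entry.lower()]


# Show what the current terminal session sees for each variable.
for variable in ("PATH", "LD_LIBRARY_PATH"):
    print(variable, cuda_entries(os.environ.get(variable, "")))
```

If an old CUDA version still shows up in either list, or nothing shows up at all, the .bashrc changes have not taken effect in that terminal.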
After changing these paths, I had to refer back to the instructions to see how to install from a Debian (.deb) file. I went to my downloads folder, right-clicked to open a terminal, and typed in the following:
sudo dpkg -i libcudnn7_220.127.116.11-1+cuda10.0_amd64.deb
After that I decided that I would try running my script again and this time…. it errored, but only because it can’t find my input.txt file!
I found a suggestion that said try using the full exact file path, and updated the script and then…. OMFG IT WORKED. IT ACTUALLY WORKED!!!!!!!!!
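An alternative to typing out the full path by hand is to build it from the location of the script, so it keeps working no matter which folder the terminal is opened in. This is a sketch only: input_path_next_to is a made-up helper, and it assumes input.txt sits in the same folder as the script.

```python
from pathlib import Path


def input_path_next_to(script_file, filename="input.txt"):
    """Build an absolute path to `filename` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / filename
```

Inside the training script, this would be used as something like t.train_from_file(str(input_path_next_to(__file__))).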
Before ending this in triumph, I added a place to store the results from textgenrnn to be used by a second script that can control the number of outputs and temperature. While it took me over a month to follow this “simple” article, it finally is all paying off! Below are my final scripts. Behold them in all their majestic glory!!!
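For readers who want a starting point, here is a rough sketch of what a generation script with adjustable output count and temperature could look like. It is not a copy of my scripts. The default weights file name, the option names, and the default values are all assumptions for illustration, and it relies on textgenrnn accepting a weights_path argument and generate accepting temperature and return_as_list, which may differ between versions.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line options for the generation step."""
    parser = argparse.ArgumentParser(
        description="Generate lines from saved textgenrnn weights.")
    parser.add_argument("--weights", default="textgenrnn_weights.hdf5",
                        help="weights file saved by the training step (assumed name)")
    parser.add_argument("-n", "--count", type=int, default=5,
                        help="how many lines to generate")
    parser.add_argument("-t", "--temperature", type=float, default=0.5,
                        help="higher values give more surprising output")
    return parser.parse_args(argv)


def generate_lines(weights, count, temperature):
    """Load saved weights and return `count` generated lines as a list."""
    # Imported here so the option parsing still works without textgenrnn installed.
    from textgenrnn import textgenrnn
    model = textgenrnn(weights_path=weights)
    return model.generate(count, temperature=temperature, return_as_list=True)


def main(argv=None):
    """Generate the requested number of lines and print them one per line."""
    args = parse_args(argv)
    for line in generate_lines(args.weights, args.count, args.temperature):
        print(line)
```

Calling main(["-n", "10", "-t", "1.0"]) would ask for ten lines at a higher temperature. Keeping generation separate from training means the slow training step only has to run once.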
And here are some of the amazing first pass “David Rose”-isms that this neural network has created!
Honestly, most of them say the word “party” several times; I guess I didn’t realize how many times he must have said that word in the first few episodes. I’m going to leave off on this for now, but I’m really excited to see how much better I can make this!