Tuesday 23 January 2024

Internet Server Blues: Serveo, Public IP, CGNAT and Accessing Your Servers from the Internet

Connection timeout

For over 2 decades I ran servers from my home. Before GitHub and the weblog, a personal website was a handy way to keep documents you might need to access. An IP camera might also need to act as a home server. An ssh server, when reachable over the Internet, turned out to be a very handy way of piercing firewalls at work. Later, IoT devices also needed a server.

In practice this worked because whenever your modem-router logs into the Internet, your service provider assigns it a public IPv4 address, reachable from anywhere on the Internet.

Then came NAT, a real blessing. Suppose you have several home computers all using the Internet at the same time. NAT software, usually running on your modem-router, lets them all share a single public IP address, saving you from having to buy multiple Internet lines.

NAT or Network Address Translation


The Internet servers replying to your computers think there is just one computer, represented by your public IP. Your NAT intercepts these replies and routes each one back to the correct individual computer.

Your internal servers have the problem in reverse. To a device on the Internet, all of them have the same address (your public IP). This is resolved by assigning each server a unique number, a port (1 of 65536 available), to identify itself, kind of like giving every occupant of your house a room number. Based on the port, an incoming request is forwarded by the router to the correct server. The router also watches for the resulting replies and forwards them back to the (potentially numerous) Internet devices. This is called Port Forwarding.

Port Forwarding

Thus all servers implicitly use different ports. For example, http servers use port 80, https servers use port 443 and ssh uses port 22.
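Most home routers hide Port Forwarding behind a configuration page, but underneath it is usually just a pair of NAT rules. On a Linux router they would look roughly like this (the interface name and internal address here are hypothetical):

$sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 192.168.0.10:80
$sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

The first rule forwards incoming web requests to the internal server at 192.168.0.10; the second is the NAT itself, rewriting outgoing traffic to use the single public IP.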

Sometime in 2022, outside access to my servers was blocked. My service provider Unifi had implemented CGNAT, or Carrier Grade NAT. This means the service provider has grouped anywhere from tens to hundreds of subscribers behind one public IP, using its own NAT upstream.

Carrier Grade Network Address Translation, or CGNAT

One immediate effect is that many professional servers now see a great deal of traffic coming from a single IP, which triggers their DDoS protection; these sites often demand confirmation or verification before you can access them.

The other problem is that my provider Unifi has chosen not merely to limit Port Forwarding but to block it outright, unless I pay extra for a Public IP or a Static IP. Incoming Internet requests no longer work. Internally, on my private LAN, the servers still work as before.

The obvious alternative is to pay for a cloud server with a Public IP, like AWS, Google Cloud, Microsoft Azure, etc.

Another alternative is ngrok, which will forward ports to you for free using an ssh trick called Reverse Tunnelling; there is a small fee if you want to use your own domain name.

But best of all is Trevor Dixon's serveo. It does ssh reverse tunnelling for free and will also allow unique, readable names. Buy Trevor a coffee sometime - he deserves it.

Say you already have an Apache webserver at port 80, which makes it an insecure (ie not https) webserver. With serveo there is no need for logins and registrations; you just dive straight in with a reverse tunnel:

$ ssh -R cmheong:80:localhost:80 serveo.net
The authenticity of host 'serveo.net (138.68.79.95)' can't be established.
RSA key fingerprint is SHA256:07jcXlJ4SkBnyTmaVnmTpXuBiRx2+Q2adxbttO9gt0M.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'serveo.net,138.68.79.95' (RSA) to the list of known hosts.
To request a particular subdomain, you first need to generate a key. Use the command
ssh-keygen to generate your key. For more information about generating and using
ssh keys, see https://www.ssh.com/academy/ssh/keygen. Once you've generated a key,
try again, and these instructions will be replaced with instructions on how to
register your key with serveo.
Forwarding HTTP traffic from https://afc2076be26e6b5cc4b2ff5c4348336f.serveo.net
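Reading the -R argument from left to right, as serveo interprets it (a sketch, based on serveo's own messages):

# ssh -R <subdomain>:<remote port>:<local host>:<local port> serveo.net
#        cmheong    :80           :localhost   :80

Since no key had been registered yet, serveo ignored the requested subdomain and handed out a random one instead.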


Over at your browser, http now works:

http://afc2076be26e6b5cc4b2ff5c4348336f.serveo.net:80

The bonus is that https works too, without modification, and the browser will not flag it as insecure:

https://afc2076be26e6b5cc4b2ff5c4348336f.serveo.net:443

The icing on the cake is subdomains. You just make an ssh key pair (if you do not already have one):

$ ssh-keygen -t rsa 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/heong/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/heong/.ssh/id_rsa.
Your public key has been saved in /home/heong/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:AbCdEfGhIjKlMnOpQr123456789 cmheong@webserver

With your new key you now do:

$ ssh -R cmheong:80:localhost:80 serveo.net                                             
To request a particular subdomain, you first need to register your SSH public key.
To register, visit one the addresses below to login with your Google or GitHub account.                            
After registering, you'll be able to request your subdomain the next time you connect                              
to Serveo.                                                                                                         

Google: https://serveo.net/verify/google?fp=SHA256%3AAbCdEfGhIjKlMnOp%2BQr123456789
GitHub: https://serveo.net/verify/github?fp=SHA256%3AAbCdEfGhIjKlMnOp%2BQr123456789

So you need to register your key with serveo. I used my Google account. Note that serveo URL-encodes your key fingerprint (':' becomes %3A and '+' becomes %2B), so paste serveo's output (not your ssh-keygen output) into your browser. Assuming you are already logged into your Google account, this works right away.

If you now redo your reverse tunnel:

$ ssh -R cmheong:80:localhost:80 serveo.net
Forwarding HTTP traffic from https://cmheong.serveo.net

Now https://cmheong.serveo.net will work, just like that. After that head over to https://serveo.net and buy Trevor Dixon that cup of coffee. The man deserves it.
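One last practical note: a reverse tunnel dies whenever your connection hiccups. If you want it to survive unattended, autossh (assuming you have it installed) can supervise and restart the ssh session:

$autossh -M 0 -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" -R cmheong:80:localhost:80 serveo.net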

Happy Trails


Wednesday 3 May 2023

Tensorflow and Keras for the Nvidia Geforce GT 710

 

... alas! either the locks were too large, or the key was too small, but at any rate it would not open any of them. 
Seemingly against the odds, CUDA and cuDNN ran on the GT 710, and I could run an AI super resolution inference program to upscale images and video. While it was gratifying to finally bump up the GPU temperature, the card hardly broke a sweat, wandering from 38 to 40 degrees Celsius, more from the time of day than from the workload. After all, my Raspberry Pi could do the same.

Training an AI might stretch it a little more; an SRGAN, one of the biggest and baddest of them all, might make an impression. There seem to be two main frameworks, PyTorch and Tensorflow. A very cursory search showed that PyTorch might require my Ubuntu 18.04 python 3.6 to be upgraded first to 3.8. This is quite possible, but having spent 3 weeks building python 3.6 for CUDA I decided it might be time to try Tensorflow.

There is the usual dilemma of juggling Tensorflow, CUDA, gcc and python versions, but using Fan Leng-Yoon's and the tensorflow sites, I settled on Tensorflow 2.2.0.




The instructions culled from tensorflow site are:

$sudo apt-get update
$sudo pip3 install "tensorflow==2.2.*"
$sudo pip3 install keras

It installed surprisingly smoothly, except when it was time for the tensorflow test:

$python3
Python 3.6.9 (default, Mar 10 2023, 16:46:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Illegal instruction (core dumped)

That was not good. A hint came from the tensorflow repository: I might be missing the AVX instructions. And indeed my CPU, an Athlon II X3 440, does not have them:

$cat /proc/cpuinfo
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp
 lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni
monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalign
sse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_
lock nrip_save
bugs            : tlb_mmatch fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2
bogomips        : 6020.19
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
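Rather than eyeballing the flags line, a quicker check is to grep for it; no output means no AVX:

$grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u
$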

The obvious thing to do was to downgrade tensorflow to version 1.5, from before AVX instructions were baked into the tensorflow binaries.

$sudo pip3 uninstall tensorflow
$sudo pip3 install "tensorflow_gpu==1.5.*"

But now I get a different failure:

$python3
Python 3.6.9 (default, Mar 10 2023, 16:46:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

It looks like it wants an older CUDA (9.0), while what I have installed is newer:
$sudo ls -lR /usr/ | grep -e libcublas
[sudo] password for heong:
lrwxrwxrwx  1 root root        15 Aug 10  2019 libcublas.so -> libcublas.so.10
lrwxrwxrwx  1 root root        23 Aug 10  2019 libcublas.so.10 -> libcublas.so.10.2.1.243
-rw-r--r--  1 root root  62459056 Aug 10  2019 libcublas.so.10.2.1.243
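Assuming /usr/local/cuda/bin is on your PATH, nvcc will also confirm the installed version; its output should end with something like:

$nvcc --version
Cuda compilation tools, release 10.1, V10.1.243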

I guess I will need to build tensorflow without AVX instructions.

$sudo -H pip3 uninstall tensorflow_gpu

To build from source I used the tensorflow instructions, which call for first installing Bazel, an enormous build tool. And since I had recently got chatGPT, why not give it a try:

chatGPT convincingly gave the wrong answer

chatGPT very convincingly gave the wrong answer, version 0.26.0. The correct answer is 3.1.0. When I pointed this out it immediately gave another equally convincing (and correct) answer:

chatGPT quickly changed its mind to version 3.1.0

For now chatGPT 3 seems to have the credibility of a used-car salesman. They say an SRGAN hallucinates all those extra pixels in an up-scaled image; chatGPT too is known to take a few flights of fancy, like ... the Mad Hatter?

“Have I gone mad? I'm afraid so. You're entirely bonkers. But I will tell you a secret. All the best people are.”


$sudo pip3 uninstall keras
$sudo apt-get update
$sudo apt-get install curl gnupg
$curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor > bazel.gpg
$sudo mv bazel.gpg /etc/apt/trusted.gpg.d/
$echo "deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$cat  /etc/apt/sources.list.d/bazel.list
deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8

$sudo apt-get install bazel-2.0.0
$bazel --version
bazel 2.0.0
$git clone https://github.com/tensorflow/tensorflow.git
$cd tensorflow
$git checkout v2.2.0
$./configure
Extracting Bazel installation...
You have bazel 2.0.0 installed.
Please specify the location of python. [Default is /usr/bin/python3]:


Found possible Python library paths:
 /usr/lib/python3/dist-packages
 /home/heong/opencv_build/opencv/build/lib/python3/
 /usr/local/lib/python3.6/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3/dist-packages]
/usr/local/lib/python3.6/dist-packages
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]:
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y

Do you wish to build TensorFlow with TensorRT support? [y/N]:
No TensorRT support will be enabled for TensorFlow.

Found CUDA 10.1 in:
   /usr/local/cuda/lib64
   /usr/local/cuda/include
Found cuDNN 7 in:
   /usr/lib/x86_64-linux-gnu
   /usr/include

Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

$bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

The build also wanted a plain python binary on the path, so:

$sudo link /usr/bin/python3 /usr/bin/python
$bazel build //tensorflow/tools/pip_package:build_pip_package

The build took the Athlon all night but completed successfully.

With a little bit of help from Isaac Lascasas:

$./tensorflow/tools/pip_package/build_pip_package.sh /tmp/tensorflow_pkg

$pip3 install --upgrade --force-reinstall /tmp/tensorflow_pkg/tensorflow-2.2.0-cp36-cp36m-linux_x86_64.whl

And it worked:
$cd ..
$python3
Python 3.6.9 (default, Mar 10 2023, 16:46:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.2.0'

$sudo pip3 install keras
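A quick sanity check that the GPU is actually visible to this build, using the stock tf.config API (the output below is what one hopes to see; the device details will vary):

$python3
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]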

Training HasnainRaz's Fast-SRGAN bumped the GT 710's temperature up to 60 degrees Celsius. Running inference on several hundred frames raised it further to 67. Somehow that felt more like real work.



Happy Trails.

Saturday 22 April 2023

Nvidia GeForce GT 710: Down the Rabbit Hole of Proprietary Obsolescence

 

" ... down went Alice after it, never once considering how in the world she was to get out again." - Lewis Carroll, 'Alice's Adventures in Wonderland'

I try to avoid proprietary software, which is why I do not usually buy Nvidia graphics cards. If I did, I would use the nouveau open source driver. But a few weeks ago, I was fooling around with some OpenCV code that uses deep-learning neural networks (DNN) for image super resolution.

It turned out Nvidia cards are really good at this, but you need their proprietary driver as well as their CUDA libraries. In particular, the OpenCV dnn module uses Nvidia's cuDNN library, which uses CUDA, which in turn requires the Nvidia binary drivers.

I started with Google Colab, a free cloud service that offers Nvidia GPUs. That was great for development, but super-scaling a video can take many hours, and Colab kept kicking me out after 2 hours for hogging the GPU.

The normal way would be to buy a desktop with, say, an Nvidia RTX 3060 12GB card for RM4200 (less than USD950), but installing and using proprietary systems was bad enough; paying good money for them really hurt. It turned out I had a 7-year-old GeForce GT 710 from Gigabyte lying around inside an even older (12 years!) Asus Crosshair IV Formula with an Athlon II at 3GHz.

So, like Alice, I dived down the rabbit hole of proprietary obsolescence on an impulse. Ubuntu 22.04 installed and ran like a breeze, but a default install (just like Colab's) using Nvidia CUDA 12 and cuDNN 8.9.0 did not work. Actually, none of the three parts (card driver, CUDA and cuDNN) worked.

Time to do my homework. Gigabyte lists my card as GV-N710SL-2GL, still on sale. The 'specs' listed were mostly marketing guff and quite useless. Techpowerup came up with the goods: its real name is GK208, its architecture Kepler and, crucially, its CUDA Compute Capability 3.5. The official Nvidia CUDA Compute Capability page does not mention the GT 710 at all.

Gigabyte GeForce GT 710


Now not all the websites agree on the GT 710, least of all Nvidia's own. The cuDNN Support Matrix excludes the Kepler architecture and implies a minimum CUDA Compute Capability of 5.0.

cuDNN 8.9.0 does not support Kepler 


Kepler not included

Yet the 2019 version of the same document, now archived and no longer linked from the main Nvidia cuDNN site, says otherwise:


Kepler supported by cuDNN 7.6.x

What this feels like is that the GeForce GT 710 is abandonware, probably for marketing reasons. Did I mention I do not like proprietary systems? But there was one more hurdle for Kepler: was CUDA support for OpenCV's DNN module written after Kepler was abandoned? Luckily that support came out of the same year's (2019) Google Summer of Code, so the chances were excellent.

So what I need is cuDNN v7.6.4, CUDA 10.1.243 and CUDA driver r419.39. cuDNN v7.6.4 is still available at the Nvidia cuDNN Archive. I chose the Ubuntu version as it was the same as Colab's. This means regressing to the much older Ubuntu 18.04 though. There are 3 packages: the runtime library, the developer library and the code samples. CUDA 10.1 is available from Nvidia, and I chose CUDA 10.1 Update 2.

And since I had only ever used Ubuntu in virtual machines, on docker, AWS or Google Colab, I had never had to install it. So here are the instructions:

Make the Ubuntu boot DVD thus:
$sudo growisofs -speed=1 -dvd-compat -Z /dev/sr0=ubuntu-18.04.6-desktop-amd64.iso

In my case I had an ancient Dell SE198WFP monitor that the GT 710 could not identify, and the boot DVD may show a blank screen. By rebooting and pressing 'e' as the GRUB bootloader starts up, it is possible to edit the boot entry and add the 'nomodeset' kernel parameter. I then got a very basic 640x480 setup for Ubuntu 18.04.

After the install, if you want a static IP address you need to do something like the following (note that stock Ubuntu 18.04 manages the network with netplan; the classic /etc/network/interfaces file below only takes effect if the ifupdown package is installed):
$sudo vi /etc/network/interfaces

And add in your IP address:
auto enp5s0
iface enp5s0 inet static
 address your.ip.addr.here
 netmask 255.255.255.0
 gateway your.router.addr.1
 dns-nameservers 8.8.8.8

After that, an ssh server is always handy:
$sudo apt install openssh-server
$sudo systemctl status ssh
$sudo systemctl enable ssh
$sudo systemctl start ssh
$sudo ufw allow ssh
$sudo nano /etc/ssh/sshd_config
$sudo service ssh restart

To set your computer host name:
$sudo hostnamectl set-hostname MyAIcomputer

Annoyingly, Ubuntu 18.04 kept setting my DNS server address to 127.0.0.53, so I did:

$sudo vi /etc/systemd/resolved.conf

And added the line
DNS=8.8.8.8

followed by a restart of the resolver:
$sudo systemctl restart systemd-resolved

And lastly, Ubuntu 18.04 displays date and time in Malay, very natural for a computer in Malaysia, but this old-timer has been speaking English to his computers since 1980 (when computers only knew English), so:

$sudo localectl set-locale LC_TIME=en_US.utf8

To prepare Ubuntu 18.04 to build OpenCV I used changx03's instructions, reproduced here for convenience:
$ sudo apt update
$ sudo apt upgrade
$ sudo apt install build-essential cmake pkg-config unzip yasm git checkinstall
$ sudo apt install libavcodec-dev libavformat-dev libswscale-dev libavresample-dev 
$ sudo apt install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev 
$ sudo apt install libxvidcore-dev x264 libx264-dev libfaac-dev libmp3lame-dev libtheora-dev 
$ sudo apt install libfaac-dev libmp3lame-dev libvorbis-dev
$ sudo apt install libopencore-amrnb-dev libopencore-amrwb-dev
$ sudo apt-get install libgtk-3-dev
$ sudo apt-get install python3-dev python3-pip 
$ sudo -H pip3 install -U pip numpy 
$ sudo apt install python3-testresources
$ sudo apt-get install libtbb-dev
$ sudo apt-get install libatlas-base-dev gfortran

"Follow the White Rabbit" - Trinity, in "The Matrix" 1999

Following the White Rabbit


$sudo apt-get install linux-headers-$(uname -r)
$wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
$sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
$wget https://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
$sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
$sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
$sudo apt-get update
$sudo init 3
$sudo apt-get -y install cuda

And after it is all done, reset the computer to load the new Nvidia graphics driver
$sudo reboot

CUDA 10.1 seems fine, but there is a problem with the Nvidia driver: it does not load:

$nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

There is a way to uninstall in the Nvidia documentation but it did not work:
$sudo /usr/bin/nvidia-uninstall
sudo: /usr/bin/nvidia-uninstall: command not found

What Nvidia thinks I should use: Gigabyte RTX3090 24GB


I guess we will have to do it the Ubuntu way, with apt. Now since the graphics card driver was packaged with CUDA 10.1 you will need to find its version, and it looks like 418.87.00:

$sudo apt list --installed | less
nvidia-compute-utils-418/unknown,now 418.87.00-0ubuntu1 amd64 [installed,automatic]
nvidia-dkms-418/unknown,now 418.87.00-0ubuntu1 amd64 [installed,automatic]
nvidia-driver-418/unknown,now 418.87.00-0ubuntu1 amd64 [installed,automatic]
nvidia-kernel-common-418/unknown,now 418.87.00-0ubuntu1 amd64 [installed,automatic]

This makes the uninstall command thus:
$sudo apt remove --purge nvidia-driver-418
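If you are unsure which driver versions the repository offers for your card, the ubuntu-drivers tool (from the ubuntu-drivers-common package) will list the candidates:

$sudo apt install ubuntu-drivers-common
$ubuntu-drivers devices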

Now I tried quite a few graphics drivers from the Ubuntu repository. Version 390 worked very well but was incompatible with CUDA 10.1. There were still issues with version 430, but cuDNN seemed a lot happier with it.

$sudo apt install nvidia-driver-430

It loads and is recognized by the X server, and you can configure it, but only at much reduced resolution instead of my Dell's 1440x900. And nvidia-smi could not seem to read the card's name (GT 710) but got most of the other parameters:

$nvidia-smi
/usr/bin/nvidia-modprobe: unrecognized option: "-s"

ERROR: Invalid commandline, please run `/usr/bin/nvidia-modprobe --help` for
      usage information.

/usr/bin/nvidia-modprobe: unrecognized option: "-s"

ERROR: Invalid commandline, please run `/usr/bin/nvidia-modprobe --help` for
      usage information.

Sat Apr 22 11:18:33 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:08:00.0 N/A |                  N/A |
| 33%   38C    P8    N/A /  N/A |     65MiB /  2000MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Note the CUDA Version is listed as 11.4; I took 10.1 to be the runtime version number.

Next is cuDNN. 

$sudo init 3
$sudo dpkg -i libcudnn7_7.6.4.38-1+cuda10.1_amd64.deb
$sudo dpkg -i libcudnn7-dev_7.6.4.38-1+cuda10.1_amd64.deb
$sudo dpkg -i libcudnn7-doc_7.6.4.38-1+cuda10.1_amd64.deb

I used the latest version of OpenCV, which at the time of installation was 4.7.0-dev:
$git clone https://github.com/opencv/opencv.git
$git clone https://github.com/opencv/opencv_contrib.git

After many trials, these build options seem to work. Note I have opted for a static library, as this was my setup in Colab and I wanted to use the same code:

~/opencv_build/opencv$mkdir build && cd build
~/opencv_build/opencv/build$cmake -D CUDA_NVCC_FLAGS="-D_FORCE_INLINES -gencode=arch=compute_35,code=sm_35" -D CMAKE_BUILD_TYPE=RELEASE -D OPENCV_GENERATE_PKGCONFIG=ON -D BUILD_SHARED_LIBS=OFF -D CMAKE_INSTALL_PREFIX=/usr/local -D INSTALL_C_EXAMPLES=OFF -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF -D WITH_OPENEXR=OFF -D WITH_CUDA=ON -D WITH_CUBLAS=ON -D WITH_CUDNN=ON -D CUDA_ARCH_BIN=3.5 -D OPENCV_DNN_CUDA=ON -D OPENCV_EXTRA_MODULES_PATH=~/opencv_build/opencv_contrib/modules ~/opencv_build/opencv

A key part of the cmake output is confirmation that both CUDA and cuDNN are included:

--   NVIDIA CUDA:                   YES (ver 10.1, CUFFT CUBLAS)
--     NVIDIA GPU arch:             35
--     NVIDIA PTX archs:
--
--   cuDNN:                         YES (ver 7.6.4)

The actual make command is:
~/opencv_build/opencv/build$make -j5

The output is

~/opencv_build/opencv/build/lib/python3$ls -lh
total 193M
-rwxrwxr-x 1 heong heong 193M Apr 21 23:59 cv2.cpython-36m-x86_64-linux-gnu.so

The One


"He's the One" - Morpheus, "The Matrix" 1999

To prove that the setup supports the Geforce GT 710:
/usr/local/cuda-10.1/samples/1_Utilities/deviceQuery$sudo make
/usr/local/cuda-10.1/samples/1_Utilities/deviceQuery$sudo ./deviceQuery
Query Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce GT 710"
 CUDA Driver Version / Runtime Version          11.4 / 10.1
 CUDA Capability Major/Minor version number:    3.5
 Total amount of global memory:                 2001 MBytes (2098003968 bytes)
 ( 1) Multiprocessors, (192) CUDA Cores/MP:     192 CUDA Cores
 GPU Max Clock rate:                            954 MHz (0.95 GHz)
 Memory Clock rate:                             800 Mhz
 Memory Bus Width:                              64-bit
 L2 Cache Size:                                 524288 bytes
 Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536),
3D=(4096, 4096, 4096)
 Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
 Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
 Total amount of constant memory:               65536 bytes
 Total amount of shared memory per block:       49152 bytes
 Total number of registers available per block: 65536
 Warp size:                                     32
 Maximum number of threads per multiprocessor:  2048
 Maximum number of threads per block:           1024
 Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
 Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
 Maximum memory pitch:                          2147483647 bytes
 Texture alignment:                             512 bytes
 Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
 Run time limit on kernels:                     Yes
 Integrated GPU sharing Host Memory:            No
 Support host page-locked memory mapping:       Yes
 Alignment requirement for Surfaces:            Yes
 Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
 Device supports Compute Preemption:            No
 Supports Cooperative Kernel Launch:            No
 Supports MultiDevice Co-op Kernel Launch:      No
 Device PCI Domain ID / Bus ID / location ID:   0 / 8 / 0
 Compute Mode:
    < Default (multiple host threads can use ::cudaSetDevice() with device simu
ltaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Vers
ion = 10.1, NumDevs = 1
Result = PASS

To run the super resolution program you will also need:

$sudo pip3 install numpy
$sudo pip3 install imutils

Finally:
$export PYTHONPATH="/home/heong/opencv_build/opencv/build/lib/python3/"
~/sr$python3 sr.py --model FSRCNN_2x.pb --input 3coyote-10s.webm --fps 25 --useCUDA
Output video will be 3coyote-10s-FSRCNN_2x.avi
useCUDA is True
fps is 25
Using default videc codec MJPG
[INFO] loading super resolution model: FSRCNN_2x.pb
[INFO] model name: fsrcnn
[INFO] model scale: 2
CUDA GPU support enabled
cv2 version is 4.7.0-dev
sys.path is ['/home/heong/sr', '/home/heong/opencv_build/opencv/build/lib/python
3', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynlo
ad', '/home/heong/.local/lib/python3.6/site-packages', '/usr/local/lib/python3.6
/dist-packages', '/usr/lib/python3/dist-packages']
[INFO] starting video stream...
Opening input video file 3coyote-10s.webm
Waiting 2s to stabilize stream ...
Opening output video file 3coyote-10s-FSRCNN_2x.avi
upscaled.shape=(720, 960, 3)
Opening output video file 3coyote-10s-FSRCNN_2x.avi
upscaled h x w is 720x960
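The sr.py program itself is not reproduced here, but its core is the OpenCV dnn_superres module. A minimal sketch, assuming the same FSRCNN_2x.pb model and a hypothetical input frame:

import cv2

# Load the pre-trained FSRCNN 2x super-resolution model
sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel('FSRCNN_2x.pb')
sr.setModel('fsrcnn', 2)  # algorithm name and scale factor

# Send inference to the GT 710 via the CUDA backend built above
sr.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
sr.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

frame = cv2.imread('frame.png')    # hypothetical 480x360 input frame
upscaled = sr.upsample(frame)      # comes back 960x720, 2x each dimension
cv2.imwrite('frame_2x.png', upscaled)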

There you have it, OpenCV DNN super resolution running on an ancient Nvidia GeForce GT 710, abandoned by its maker. The archives are spotty and it still has software issues. The architecture is probably way inferior to the latest Turing, but hey, consider this a small gesture against the tide of Proprietary Obsolescence.

Did I mention I dislike proprietary software? Happy Trails.



Thursday 2 February 2023

Repairing Analog AC Voltmeters is Fun and Easy

Isolation Transformer: the AC Voltmeter is on the left

I find isolation transformers really handy. They are simple to make and greatly improve safety when repairing switch-mode power supplies, which are nearly ubiquitous these days. A 230Vac-230Vac isolation transformer is just 2 transformers connected back-to-back.


A 48VA Isolation Transformer is just 2 230V/24V transformers back-to-back

It takes in mains 230V (or 220V, 110V, etc) and works by limiting the output VA to that of each individual transformer. The energy transfer is via magnetic coupling, and you can actually short-circuit the output and still only draw 48VA (in this case). Live-Neutral shorts occur a lot in SMPS failures, and an isolation transformer reduces the drama (ie smoke, flames) when they do.

When the output of an isolation transformer is shorted, it is handy to have some visual indication, perhaps a neon light, and especially an AC voltmeter. If you turn off the faulty SMPS quickly enough it is possible to prevent multiple component failures.

But make no mistake, 48VA (or 209mA at 230Vac) can still be lethal so all the usual safety precautions on mains voltage still apply! The isolation protects your repair item, not you.

Another use is to limit power to an electric drill. Maybe you want to use it as a screwdriver: it does not take long to discharge a battery-powered drill, and recharging takes hours, a real project buzzkill. Or maybe you do not want your workpiece to spin loose when the drill gets stuck. I work with tropical hardwood sometimes, and at full mains power the drill often burns the wood as much as cuts it. Or maybe you are using it with a holesaw and do not want your wrist broken when it jams ...

If you use transformers with multiple secondary windings you can produce multiple output voltages by switching the secondary coils. Here I have used two 12-0-12V 2A transformers to produce 230V or 115V at the output. Note your VA rating will vary with the secondary tapping you select and must be recalculated: with both 12V windings in series (24V at 2A) the pair passes 24V x 2A = 48VA, but a single 12V winding gives only 12V x 2A = 24VA. In my case the VA at 230Vac is 48, but at 115Vac just 24.

So when the AC voltmeter failed after many years of being left on continuously, I popped it open, curious to see if it could be fixed.



AC Voltmeter, disassembled. Faulty resistor on top right.

It was just  a coil moving a needle against a spring. There were 2 precision wirewound resistors at each end, the left one to set the zero offset and the right one to set  the full scale deflection of the needle.

There is an excellent website with the details.

Basic AC Voltmeter

The measured values were clearly off: the coil reads about 5K, the zero offset resistor 10K and the full scale resistor 16K. This resulted in the meter showing 150V when the input was 200V. And often the coil's torque was insufficient to move the needle off zero without a vigorous knock, so the meter would read zero - a false short-circuit alarm!

The full scale resistor was black with heat and had actually melted the insulation of the meter coil wire next to it. That made it a likely suspect. When I replaced it with a 5K1 resistor the meter showed a much more reasonable 210V.

Now I should have set the correct reading using a decade box (ie a precision variable resistor) and bought a wirewound resistor of the correct value and precision to replace it. But I only had that one 5K1 5W ceramic resistor, it was the Chinese New Year season (even online shops are on holiday), a 10Vac error did not look so bad, and I only needed it to indicate a short circuit ...

I also drilled a couple of tiny holes in the meter casing to let out all that heat, and it was back in service. A more modern digital meter would have required poking at a switch-mode power supply (typically a buck DC-DC converter) powering a microprocessor that reads the voltage with an analog-to-digital converter and drives an LED display.

Repairing an old-school analog AC meter was fun and easy in comparison. Happy Trails.



Thursday 19 January 2023

Electronics Without Semiconductors: The Strange Case of the VW Beetle (Bug) Fuel Gauge Regulator

 

Volkswagen Beetle (Bug) Type 1. Photo by Vwexport1300 

The fuel gauge in my 1969 VW Beetle (Bug) Type 1 failed. The needle kept indicating 'Full' no matter how much petrol I had in the tank. The workshop said a new meter assembly would take weeks to arrive, and I did not fancy driving with a jerrycan of fuel in case I ran out.

Now I know very little about cars; my thing is electronics and software. However, from 1968 onwards the Beetle fuel gauge system is electrical (aha!), and armed with Speedy Jim's excellent webpage on Beetle/Bug fuel gauges, I got to work.

VW Beetle Type 1 fuel gauge schematic by Speedy Jim

The schematic shows a voltage regulator (quaintly called a 'vibrator') powering the meter in series with a potentiometer controlled by a float. The potentiometer and float ('sender') live inside the fuel tank.

VW  Beetle electrical type fuel sender 

The fuel tank is easily accessible for testing - just pop the front bonnet.

Beetle/Bug fuel tank. Sender is in front middle of the fuel tank

The output of the potentiometer (sorry - sender) connects directly to the gauge and is easily disconnected.

Fuel Gauge Sender Connector 

The fuel gauge is built into the speedometer, at the top middle. This explains the high replacement cost.


VW Beetle Speedometer: fuel gauge is top middle 

The cabling is at  the back, and luckily the wires are also easily accessible. 

Speedometer, back view. The regulator is riding on the meter's shoulder, on the left

Top view of mounted regulator. Photo by wagohn


Speedy Jim has a cut-out view of the fuel gauge; the current heats up a bimetallic strip and directly drives the needle. Far out! No magnet, no electrical coil. This is so cool. 

Picture by Speedy Jim

The regulator when dismounted looks like this, and is a little reminiscent of a 3-terminal regulator:

VW  Beetle/Bug Fuel Gauge Regulator

3-terminal regulator: LM7812 in TO-220 package

It is time to test. A quick check with the multimeter showed my sender terminal reading 5V with the ignition on. With the sender disconnected, the wire from the gauge read 10.8V. With the ignition off, the sender read 18 Ohms; Speedy Jim has it as 73 Ohms empty and 10 Ohms full. After a couple of days of driving, the sender read 28 Ohms. So the sender seemed to be working. This was further confirmed by parking the Beetle uphill and then downhill to move the needle some more.

Next is the test for the fuel gauge. Speedy Jim (thanks, Jim!) has detailed instructions. With the wire disconnected (do not let it short to the VW body!), the gauge read 'Empty'. With the wire shorted to the car body, it read 'Full' as before.

This leaves the voltage regulator as the prime suspect. My guess was that it had shorted its input to its output and was applying the full 12V input voltage to the gauge's bimetallic strip, hence the constant 'Full' reading. That happens often enough in 3-terminal regulators.

Regulator pinout: photo by Speedy Jim


Now rather than fumble around with my gauge, Speedy Jim has very handy pictures of a disassembled regulator.

Photo by Speedy Jim

There are no semiconductors in the regulator! It is not even solid-state: it is all metallic. The redoubtable Speedy Jim is worth quoting in full:

"... 12V from the battery heats up the heater element and warms the strip.  The thermal mass is small and the strip responds very rapidly.  As soon as it begins to move, the strip causes the contact points to open.  This breaks the circuit and the current ceases.  Now, the strip begins to cool off and bends back to its original shape, closing the contact.  This repeats, over and over.  The result is a series of pulses, each with a voltage of 12V.  When the pulses are fed to the gauge, the heater element in the gauge averages the pulses out.

The closer the pulses are together or of longer duration, the hotter the heater in the gauge will get.  By accurately controlling the pulses, the stabilizer has the effect of regulating the voltage (here, we're talking about RMS or "effective" voltage).  Suppose that battery voltage goes up (as when the generator increases output).  The heater in the stabilizer will heat up more rapidly and open the contact points sooner.  The result will be shorter pulses of 12V sent to the gauge.  The opposite happens when the battery voltage goes lower."
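Speedy Jim's "effective voltage" is just the RMS value of the pulse train. A quick back-of-envelope check (a sketch, assuming an ideal 0-12V square wave at the 33% duty cycle from the diagram below):

from math import sqrt

# For a square wave switching between 0V and v_peak at duty cycle d,
# average power into a resistive heater is d * v_peak**2 / R,
# so the effective (RMS) voltage is v_peak * sqrt(d)
v_peak = 12.0
d = 0.33
print('effective voltage: %.1f V' % (v_peak * sqrt(d)))  # about 6.9 V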

Diagram by Speedy Jim

The diagram leaves no doubt. Far from a humble LM7812, this is a switched-mode power supply. The diagram shows a pulse-width modulated (PWM) output at a frequency of 3Hz and a duty cycle of 33%. And all this using just metal. It is as if I decapped an SMPS controller IC like the MC34060 and all I found was solder and wire! Compare this to a typical SMPS:

Solid-state SMPS

And you cannot argue with the reliability: I got the Beetle in 1992, and that regulator must have lasted 30 years of steady use in a harsh environment with lots of vibration and shock. I would have been pleased if a solid-state regulator lasted half as long. Since the car is a 1969 model, there is a good chance the regulator lasted over 50 years.

Now since the switching rate is only 3Hz, it should be visible if connected to a light. Speedy Jim used a light bulb, but these days there are very cheap 12V LED modules, especially if you cut one off a strip light.

12V LED module: just connect directly to the regulator

I unhooked the sender wire and connected it to the LED module. It lit up but did not blink, so the bimetallic strip in the regulator was not switching. I ordered a cheap China part for RM28 (USD4). This would just be a rudimentary solid-state regulator with a zener diode and a limiting resistor.

Now I do not recommend connecting anything electrical to the fuel tank, much less 12V from a car battery that can potentially deliver 200A, so extreme care is necessary, in particular when you connect up the wires. Note that the current from the regulator comes from the 12V car battery and is limited by the coil in the fuel gauge, so we will use that as the power source. Still, the following section delivers a 12V PWM signal into the fuel tank potentiometer (as well as the fuel gauge) and there is always a risk of sparks, especially if you move the fuel tank.

But it was still a good 10 days before the part arrived, and in the meantime it would be nice to actually produce those PWM pulses, if only to see history in action once more ...

L293D Motor Shield for NodeMCU ESP-12E V2
 

I had been working on a WiFi-controlled L293D Motor Shield for the ESP-12E NodeMCU, and it produces 12V PWM pulses suitable for driving DC motors as well as LED lighting. It runs an Arduino sketch, and you can get a copy from github.

The key change is to the PWM frequency:

analogWriteFreq(3); /* Arduino v1.8.5 only */

Note the current Arduino reference says the minimum value is now 100Hz. However, I am on an old version, 1.8.5, and could dial the PWM frequency right down to 3Hz, so your mileage may vary.

I disconnected the regulator from the sender, and wired up the motor shield thus:

Pinout for VW Fuel Gauge Regulator PWM

The LED module is used to observe the PWM blink rate. You enter your WiFi access point SSID and password, recompile and download the program into the ESP-12E V2, usually via the microUSB port.

You will want to test your setup driving 12V LEDs instead of the sender, connecting to the fuel tank only at the last possible moment; I set up everything, including the phone browser, before doing so. In particular you do not want to accidentally reverse the polarity to the sender, either by miswiring or by using the program's motor-reverse command. An LED indicator is better than a filament bulb here.

Once the program starts, your 12V LED module will start blinking at 3Hz. To send a signal to the fuel gauge, use a browser (I used Google Chrome on my Android smartphone) and type in:

http://12.34.56.78:8080/pwm1/33

And the fuel gauge immediately started registering the petrol level in the tank. This is because the L293D produces a 3Hz pulse train at 33% duty cycle, just like Speedy Jim said.

And you can produce a zero fuel reading by:

http://12.34.56.78:8080/pwm1/0

A full tank reading is

http://12.34.56.78:8080/pwm1/100
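The same requests can also be scripted from a shell instead of a browser, using the same placeholder address:

$curl http://12.34.56.78:8080/pwm1/33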

Not wanting to push my luck, I took the setup down as soon as I could. The new regulator arrived 10 days later and the VW fuel gauge was fixed. I kept the faulty regulator: it is a reminder that the SMPS is a lot older than solid-state electronics.

Root cause: the heater element appears to have disintegrated, leaving the regulator stuck in the 'On' position and never switching off.


Happy Trails.