Gateway Bible : Censornet Ltd

If there is an issue with a USS gateway, please establish the version of the gateway and get debug logs. This helps identify which services are degrading.

TABLE OF CONTENTS

Check if Squid is running
Overriding files
Commands for performance analysis
Top Breakdown (simplified)
HTOP breakdown
Excessive Proxy Connections
Running out of storage
Change interface properly
View routing table
Navigating Logs
Updating the Gateway using packages
USS Gateway Authentication issues
Failure Joining Domain altogether

Check if Squid is running

When troubleshooting any gateway issue, it’s important to verify that Squid is running, establish which version is running, and know how to restart it:

Command to check if squid is running:

ps aux | grep squid

If Squid is not running, the output looks like this:

If Squid is running, the output looks like this:

The squid version is highlighted in yellow:
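
One gotcha with the check above is that "grep squid" can match its own process line, so the output is never completely empty. A minimal sketch of a stricter check follows; the function name squid_status is hypothetical and not part of the gateway tooling.

```shell
# Report whether any Squid process appears in `ps aux` output read from stdin.
# The bracket pattern '[s]quid' matches the text "squid" but not the grep
# command line itself, which contains the literal text "[s]quid".
squid_status() {
    if grep -q '[s]quid'; then
        echo "squid: running"
    else
        echo "squid: NOT running"
    fi
}

# Usage on the gateway (not executed here):
#   ps aux | squid_status
```

The same bracket trick works directly on the command line: ps aux | grep '[s]quid'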

Overriding files

Squid override files are located in this directory: /usr/local/uss-squid5/etc

NOTE: Changes to the override files must be made in the uss-squid4 directory on proxy versions 1.x and in the uss-squid5 directory on versions 2.x.

Commands for performance analysis

Check available disk space:

df -h

If you encounter file systems that show 100% usage in the "Use%" column of this command, those file systems are completely full and could be causing performance issues.

du --max-depth=1 -h /

This command estimates disk space usage for directories and subdirectories, limiting the depth of the analysis to the immediate subdirectories of the specified path ('/' in this case). Use it in conjunction with the mount and df -h commands.

df -i

The df -i command displays information about inode usage on file systems. Inodes are data structures in a Unix-like file system that store metadata about files and directories, including file ownership, permissions, timestamps, and pointers to the actual data blocks on the storage device. Each file or directory on a file system is associated with an inode. The output of the df -i command shows the inode-related information for each mounted file system. Ensure that no file system has run out of inodes.
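
The disk and inode checks above can be combined into one filter. This is a minimal sketch, assuming the POSIX-style output of df -P (or df -iP), where column 5 is the usage percentage and column 6 the mount point; the function name flag_full and the default 90% threshold are illustrative choices, and mount points that contain spaces are not handled.

```shell
# Print mount points whose usage is at or above a threshold (default 90).
# Reads `df -P` or `df -iP` output on stdin and skips the header row.
flag_full() {
    awk -v limit="${1:-90}" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)      # strip the trailing percent sign
        # Skip rows without a numeric percentage (for example "-")
        if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0) print $6, pct "%"
    }'
}

# Usage on the gateway (not executed here):
#   df -P  | flag_full 90     # disk space
#   df -iP | flag_full 90     # inodes
```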

Monitor processes:

top

or you can use:

htop

Note: htop is better as it is easier to read. Please pay attention to the Cpu%, Mem% & Uptime presented here.

Check RAM usage

free -h

Confirm that memory is not full or close to full.

free -s 3

This checks the RAM usage every 3 seconds.
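
When judging whether RAM is "full", the "available" column is usually a better guide than "free", because memory used for cache can be reclaimed. A minimal sketch, assuming the column layout printed by recent procps versions of free (total, used, free, shared, buff/cache, available); the function name mem_available_pct is hypothetical.

```shell
# Print available memory as a whole-number percentage of total memory.
# Reads plain `free` output (not `free -h`, whose values carry unit suffixes).
mem_available_pct() {
    awk '$1 == "Mem:" { printf "%d\n", $7 * 100 / $2 }'
}

# Usage on the gateway (not executed here):
#   free | mem_available_pct
```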

TOP Breakdown (Simplified)

The top command shows the live processes on the USS gateway along with their associated load.

- Load average: this displays the CPU load on the gateway. As a rule of thumb, none of the load averages should exceed the number of CPUs on the appliance: if a gateway has 4 CPUs, the load shouldn't exceed 4.0, otherwise there will be latency.

- Process table: the table below the summary shows a live list of processes (COMMAND column) and their associated memory and CPU usage (very similar to Windows Resource Monitor).

Top should be the first point of reference for troubleshooting a slow gateway.

Below is an example of two proxies, 80 and 81.

We can see normal and expected load averages on gateway 80 (below 2.5), while gateway 81 shows abnormal load averages of 9+.

We can then use the live process table to try to identify an issue, such as a process that is using too much CPU:

HTOP breakdown

HTOP is used to examine processes, sometimes referred to as tasks.

Highlighted RED below are CPU cores:

The numbers on the top left, from 1 to 8, represent the CPUs/cores in the system, and the progress bar next to each one shows the load on that CPU/core. The progress bars can be made up of different colours. The following list explains what each colour means.

Blue: low priority processes (nice > 0)
Green: normal (user) processes
Red: kernel processes
Yellow: IRQ time
Magenta: Soft IRQ time
Grey: IO Wait time

Highlighted RED below are Memory and Swap progress bars:

Like the CPU progress bars, the memory and swap progress bars can be made up of different colours. Here is what the colours mean for the memory and swap progress bars.

Green: Used memory pages
Blue: Buffer pages
Yellow: Cache pages

Highlighted RED below are Load averages:

The system load is a measure of the amount of computational work that a computer system performs. The load average represents the average system load over a period of time. 1.0 on a single-core CPU represents 100% utilization. Loads can exceed 1.0; this just means that processes have to wait longer for the CPU. 4.0 on a quad-core represents 100% utilization. Anything under a 4.0 load average on a quad-core is OK, as the load is distributed over the 4 cores.

The first number is the 1-minute load average, the second is the 5-minute load average and the third is the 15-minute load average.
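
The rule of thumb above (load should not exceed the number of cores) can be checked with a short script. This is a minimal sketch; the function name load_status and its OK/OVERLOADED labels are illustrative, not gateway tooling.

```shell
# Compare a load average against the number of CPUs.
# $1 = load average (for example the 1-minute value), $2 = CPU count.
load_status() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (load + 0 > cpus + 0) print "OVERLOADED"
        else print "OK"
    }'
}

# Usage on a Linux host (not executed here). /proc/loadavg starts with the
# 1-, 5- and 15-minute averages; nproc prints the number of available CPUs.
#   read -r one five fifteen rest < /proc/loadavg
#   load_status "$one" "$(nproc)"
```

For the two example proxies above, a load of 2.5 on 4 cores would report OK and a load of 9 on 4 cores would report OVERLOADED.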

HTOP Task information breakdown

Htop lists all the running processes (tasks) on a system, with information about how much CPU and memory each process is using as well as the command used to start the process.

Here is a list that explains what each column means.

PID: A process’s process ID number.
USER: The process’s owner.
PR (shown as PRI in htop): The process’s priority. The lower the number, the higher the priority.
NI: The nice value of the process, which affects its priority.
VIRT: How much virtual memory the process is using.
RES: How much physical RAM the process is using, measured in kilobytes.
SHR: How much shared memory the process is using.
S: The current status of the process (zombied, sleeping, running, uninterruptedly sleeping, or traced).
%CPU: The percentage of the processor time used by the process.
%MEM: The percentage of physical RAM used by the process.
TIME+: How much processor time the process has used.
COMMAND: The name of the command that started the process.

The difference between VIRT, RES & SHR:

VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes.

VIRT represents how much memory the program is able to access at the present moment.

RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This also corresponds directly to the %MEM column)

SHR indicates how much of the VIRT size is actually sharable memory or libraries. In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and will be counted in VIRT and SHR, but only the parts of the library file containing the functions being used will actually be loaded in and be counted under RES.

HTOP PROTIPS:

Press F5 to enable tree view. This helps massively.
Navigate the processes using the arrow keys
Kill a process using the F9 key
List open files used by a process by pressing the 'l' key ('q' to return from that view)
Display only the processes of a single user by pressing 'u'
Sort by any HTOP column by pressing F6
SUPER ADVANCED: You can customise how htop functions and displays by pressing the F2 key (Escape key to exit)
HTOP IS CLICKABLE, you can LEFT CLICK entries! (handy for sorting)
Lastly, for a really good breakdown, please refer to this document: https://peteris.rocks/blog/htop/

Excessive Proxy Connections

A proxy may slow down if an excessive number of connections are active and open. In that scenario, investigate the number of connections per device and address any device that may be spamming connections, or upgrade to a load-balanced setup or add another standalone proxy to the mix.

The following commands should be run as the root user from the CensorNet server command line.

Display a list of connections from unique IP addresses to the proxy (first column connection count, second column IP address):

netstat -n | grep -E '8080|8081|8082'  | awk '{print $5;}' | cut -d : -f 1 | sort | uniq -c

Display a count of unique IP addresses connected to the proxy (also includes closed connections):

netstat -n | grep -E '8080|8081|8082'   | awk '{print $5;}' | cut -d : -f 1 | sort | uniq | grep -c .

Display a list of unique IP addresses which have open and active (ESTABLISHED) connections to the proxy, with a connection count for each:

netstat -n | grep -E '8080|8081|8082' | grep ESTAB | awk '{print $5;}' | cut -d : -f 1 | sort | uniq -c
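
The grep in the commands above matches 8080, 8081 or 8082 anywhere on the line, including client source ports. A stricter variant is sketched below under the assumption that netstat -n prints the local address in column 4 and the remote address in column 5, as it does on Linux; the function name top_talkers is hypothetical.

```shell
# List clients by number of connections to the proxy ports, busiest first.
# Reads `netstat -n` output on stdin and only counts TCP lines whose LOCAL
# address (column 4) ends in one of the proxy ports.
top_talkers() {
    awk '$1 ~ /^tcp/ && $4 ~ /:(8080|8081|8082)$/ {
        ip = $5
        sub(/:[0-9]+$/, "", ip)     # drop the client source port
        print ip
    }' | sort | uniq -c | sort -rn
}

# Usage on the gateway (not executed here):
#   netstat -n | top_talkers | head -n 10
```

Removing only the trailing port, rather than cutting at the first colon, also keeps IPv6 client addresses intact.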

Running out of Storage

Firstly, check the syslog and the postgresql log to verify that the gateway is running out of space; the errors should be very verbose and easy to spot. A clear sign that the gateway is completely out of space is Squid refusing to stay up.
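
A quick way to spot such errors is to search the logs for the standard Linux "No space left on device" message. This is a sketch only: services may word the error differently, and the function name find_disk_full_errors is hypothetical.

```shell
# Print log lines that report a full disk, reading the log on stdin.
# "No space left on device" is the usual ENOSPC error text on Linux.
find_disk_full_errors() {
    grep -i 'no space left on device'
}

# Usage on the gateway (not executed here):
#   find_disk_full_errors < /var/log/syslog | tail -n 5
```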

These commands will let you check the system storage:

df -h  [this displays general storage; the sda1 device is the main partition to worry about]

df -i  [this displays inode usage; large numbers of tiny files can exhaust the inodes and fill up the gateway]

If the gateway is indeed full, you will need to figure out what's causing it. If the storage was simply low to begin with, the user will need to increase the size of the partition.

However, if the usage seems unusual, check what's filling up the storage; it could be an unrotated log, or something spamming or looping.

The following command will display all of the largest files on the server in order of size:

sudo find / -type f -exec du -h {} + 2>/dev/null | sort -rh | head -n 20

It should produce an output like this:

Resizing the existing sda partition with unused space

In general the main partition will have unused space.

We can see that, using the logical volume manager, the sda3 partition is 79GB but there is a logical volume sub-partition which is 39GB and not being used.

We can free up some space in a pinch by re-partitioning:

This command grows the main logical volume by 20GB using the unused space and resizes the file system to match.

sudo lvresize -L +20G --resizefs /dev/ubuntu-vg/ubuntu-lv

Change Interface Properly

This is not normally necessary, but it is important to know if the gateway reverts to an old IP address or the internet connection isn't working.

Firstly, check the interfaces and change them in the USS gateway UI:

On the proxy itself, you can check them as follows.

Firstly, using ifconfig:

Then in the database. To access the database, use these commands:

su - postgres

psql cloudgw

\dt

(The command \dt isn’t necessary, as it just lists the DB tables; it is useful if you want to see what else is there.)

select * from interfaces;

To edit interfaces using the DB (which you might need to do if the gateway UI is not reachable) use a simple SQL query:

update interfaces set address='192.168.1.x' where if_name='interface_name';

For example:

Finally, the location where interfaces and IP settings tend to get stuck is netplan:

To access the netplan file use the following command:

sudo nano /etc/netplan/00-installer-config.yaml

The file will look like this:

Editing the existing entries in the file is self-explanatory. However, adding new entries (which you likely won’t have to do) is a bit more complex, as the YAML format is indent-sensitive.

Use Spaces, Not Tabs: YAML requires spaces for indentation. Tabs are not allowed.
Consistent Number of Spaces: Use a consistent number of spaces for each level of indentation. The most common choices are 2 spaces, 4 spaces, or even 8 spaces.
Maintain Alignment: Keep items at the same level aligned properly. This helps with readability.
Nested Structures: When you have nested structures (like lists within lists or dictionaries within dictionaries), indent them further.
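
Because tabs are invisible in most editors, it can help to check the file for them before applying it. A minimal sketch; the function name find_tabs is hypothetical.

```shell
# Print the line number of every line that contains a tab character.
# Reads the file on stdin and prints nothing if the file is clean.
find_tabs() {
    awk 'index($0, "\t") { print NR }'
}

# Usage on the gateway (not executed here):
#   find_tabs < /etc/netplan/00-installer-config.yaml
```

If the netplan version on the gateway supports it, "sudo netplan try" applies the new configuration and rolls it back automatically unless you confirm it, which is safer than "sudo netplan apply" when working over SSH.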

View routing table

To view the gateway routing table run the command:

route -n

Normally the gateway should route automatically between NICs, but in rare cases, for example when customers use public IP ranges instead of private ones (not 10.0.0.0, 192.168.0.0, etc.), a static route may be necessary.

SSH into the gateway using PuTTY or a similar tool
Run “sudo su” (without quotes) to gain root privileges
Edit (or create if it doesn't exist) /etc/ussgw_custom_firewall
If newly created, add "#!/bin/bash" at the top (without quotes)
Add your route command, e.g.

route add -net 40.0.0.0 netmask 255.0.0.0 gw 10.0.5.1 dev eth0

Save the file. It should now look like this:

#!/bin/bash
route add -net 40.0.0.0 netmask 255.0.0.0 gw 10.0.5.1 dev eth0

If the file is newly created, make it executable by running:

chmod +x /etc/ussgw_custom_firewall

Finally, restart the sysmond process by running:

service ussgw_sysmond restart

Or by rebooting the gateway
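
The route and ifconfig commands come from the net-tools package, which may not be present on newer Ubuntu releases. The iproute2 equivalents are "ip route show" for viewing and "ip route add" for adding, and the latter takes a prefix length rather than a netmask. A minimal sketch of the conversion follows, assuming a valid contiguous netmask; the function name mask_to_prefix is hypothetical.

```shell
# Convert a dotted netmask (for example 255.0.0.0) to a prefix length (8)
# by counting the set bits in each octet. No validation is performed.
mask_to_prefix() {
    echo "$1" | awk -F. '{
        bits = 0
        for (i = 1; i <= 4; i++) {
            n = $i + 0
            while (n > 0) { bits += n % 2; n = int(n / 2) }
        }
        print bits
    }'
}

# Equivalent of the route command above (not executed here):
#   ip route add 40.0.0.0/"$(mask_to_prefix 255.0.0.0)" via 10.0.5.1 dev eth0
```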

Navigating Logs

The easiest way to go through logs in detail is to export them in the debug zip file and investigate in a good editor. However, if you’re troubleshooting and need to go through the logs and info on the gateway here is the list of locations and ways of using the ubuntu viewers:

Viewers:

less - views the entire file; use the arrow keys or Page Up/Page Down to navigate ('q' to quit)
cat - prints the whole file, leaving you at the end of the output
tail - views only the last lines of a file (10 by default), good for quick analysis
nano - not really for logs, but a useful text editor to view and/or make changes to files

Logs: (prefix the location with any of the above: less/cat/tail)

/var/log/syslog -system log
/var/log/postgresql/postgresql-12-main.log -database log (the version number may differ depending on the gateway version; list the directory with ls -la to check)
/usr/local/uss-squid5/var/logs/access.log -squid access log
/usr/local/uss-squid5/var/logs/cache.log -squid cache log
/var/www/api/application/logs/ -directory containing API logs for checking icap connection
dmesg | less -prints the kernel message buffer, which contains information about the system's hardware and drivers. This log provides a view of the system's kernel activity.

Updating the Gateway using packages

Most of the time the gateway can be updated using the standard “apt-get update && apt-get upgrade” command, however if a gateway release is not yet available on the stable repo you may need to download the Debian packages and install them manually:

Firstly, repositories (including the agent ones for quick access):

USS Gateway: https://downloads.clouduss.com/gateway/packages/
USS Agent WIN: https://downloads.clouduss.com/windows/
USS Agent MAC: https://downloads.clouduss.com/macosx/

In Putty, do the following:

sudo su
cd /tmp

wget https://downloads.clouduss.com/gateway/packages/ussgateway_2.0.59_amd64.deb

(Note: you can use a different link to a deb file; the above is just an example.)

Lastly:

sudo apt-get -f install ./ussgateway_2.0.59_amd64.deb

The new version should then install. You can check the version using this command:

dpkg -l | grep ussgateway
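
To pull out only the version number, for example to paste into a ticket, the dpkg -l output can be filtered. This is a minimal sketch: the package name ussgateway is an assumption taken from the .deb file name above, and the function name installed_version is hypothetical.

```shell
# Print the version of an installed package from `dpkg -l` output on stdin.
# Only rows in state "ii" (installed) with an exact name match are reported.
installed_version() {
    awk -v pkg="$1" '$1 == "ii" && $2 == pkg { print $3 }'
}

# Usage on the gateway (not executed here):
#   dpkg -l | installed_version ussgateway
```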

USS Gateway Authentication issues

If users get the authentication pop-up or any kind of Squid cache-access-denied message, there is likely to be an issue with AD authentication (note: this has nothing to do with the AD syncing in the USS dashboard).

The customer should be able to go through it themselves with the public guide available here: https://censornet.freshdesk.com/a/solutions/articles/80000809322

But it’s important that you know and understand these steps. Also, remember to enable debug mode in the USS gateway UI when joining and when testing the join and the keys, in order to get full output on any errors.

Failure Joining Domain altogether

WARNING: This is a last resort.

Join the domain via the command line, reference ticket: 67869

Notes: The domain I tested with is JOTUNHEIM.LOCAL - please replace this with the domain you're using
The commands on the command line MUST match the case - if I use upper case, please do so also. It won't work without this.

Pre-requisite – please install the Kerberos utilities using:

apt-get -y install krb5-user

1. Create the Active Directory in the UI as normal (replacing the fields with details for your own domain)

2. Join the domain using this command (note, use your own user and replace JOTUNHEIM.LOCAL with your own domain in caps here in the filename):

net ads join -U mick@jotunheim.local -s /etc/samba/JOTUNHEIM.LOCAL_smb.conf

It will prompt for a password, which you won’t see echoed to the screen while typing. If the join is successful, you will see an error about adding the DNS record, but this can be ignored as this will need to be done manually.

3. Update the proxy db with this info (Note, replace JOTUNHEIM.LOCAL here with your own domain):

sudo su -c "psql -d cloudgw -c \"update ad_auth_domains set is_joined='1' where realm='JOTUNHEIM.LOCAL'\";" postgres

It should look like this if successful:

4. Create the keys (replacing the user and JOTUNHEIM.LOCAL in the filename with your own info):

net ads keytab add_update_ads HTTP -U mick@JOTUNHEIM.LOCAL -s /etc/samba/JOTUNHEIM.LOCAL_smb.conf

Verify that the keys have been created:

klist -ke /etc/krb5.keytab

5. Update the proxy db with this info:

sudo su -c "psql -d cloudgw -c \"update ad_auth_domains set has_keys='1' where realm='JOTUNHEIM.LOCAL'\";" postgres
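
Since the steps above depend on the realm being in upper case, it can help to derive the realm and the Samba config path from the domain name once and reuse them. A minimal sketch; the function names to_realm and smb_conf_path are hypothetical, and the path pattern is taken from the commands above.

```shell
# Upper-case a domain name to get the realm used in the steps above.
to_realm() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Build the per-domain Samba config path used by the net ads commands.
smb_conf_path() {
    printf '/etc/samba/%s_smb.conf\n' "$(to_realm "$1")"
}

# Usage on the gateway (not executed here):
#   net ads join -U mick@jotunheim.local -s "$(smb_conf_path jotunheim.local)"
# To confirm both flags afterwards, as the postgres user:
#   psql -d cloudgw -c "select realm, is_joined, has_keys from ad_auth_domains;"
```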