[CRASH] Hetzner Dedicated server Random crash | PlexGuide.com



WikiZell

Citizen
Original poster
Jun 2, 2019
7
1
Hello Guys,

I recently moved to a new, more powerful Hetzner server, and the whole system is crashing at random.

The system is running:

Ubuntu 18.04
Plex Media Server (latest version)
rclone (latest stable version)

The only change I made compared to the previous server was adding support for HW transcoding, following this guide:


I've already run a RAM test, which reported no errors.

Usually when it crashes the server is not serving any streams. It becomes unreachable and I have to perform a hard reset from the admin panel.

Any suggestions? Where can I start looking for the cause of the crash?

Thank you
 

Sorkca

Active
Donor
Nov 30, 2018
41
13
Logs, logs... and did I mention... more logs? :)

Start with the system logs for the OS itself.

If you are not Linux savvy, install Webmin on its default port 10000 and use its web GUI to browse to the logs.
You can also use it to browse to the containers' folders and their logs.

Try posting some of them to Pastebin and then post the links here (do not paste an entire log directly into a post), thanks.

Once we can see and review them, it should help get to the bottom of this quicker than chasing your tail.
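
As a starting point for the log review suggested here, this is a minimal shell sketch that prints the lines written just before the most recent boot in a syslog-style file. It assumes the kernel boot banner ("Linux version ...") marks where the reboot begins, which is typical for kern.log and syslog on Ubuntu but not guaranteed. The function name `lines_before_last_boot` is made up for illustration.

```shell
# Print the N lines (default 20) that precede the most recent kernel
# boot banner in a log file, i.e. the last things logged before the
# machine froze and was reset.
lines_before_last_boot() {
    lbb_file="$1"
    lbb_count="${2:-20}"
    # Line number of the last "Linux version" banner (assumed boot marker).
    lbb_boot=$(grep -n 'Linux version' "$lbb_file" | tail -n 1 | cut -d: -f1)
    if [ -z "$lbb_boot" ]; then
        echo "no boot banner found in $lbb_file" >&2
        return 1
    fi
    lbb_start=$((lbb_boot - lbb_count))
    [ "$lbb_start" -lt 1 ] && lbb_start=1
    lbb_end=$((lbb_boot - 1))
    # Banner on the first line: nothing was logged before it.
    [ "$lbb_end" -lt 1 ] && return 0
    sed -n "${lbb_start},${lbb_end}p" "$lbb_file"
}

# Example calls (default Ubuntu log paths):
#   lines_before_last_boot /var/log/syslog 50
#   lines_before_last_boot /var/log/kern.log 50
# With a persistent systemd journal, the previous boot can be read directly:
#   journalctl -b -1 -n 100
```

If the log was rotated after the crash, the relevant lines may be in `/var/log/syslog.1` instead.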
 

WikiZell

Citizen
Original poster
Jun 2, 2019
7
1
I'm waiting for the next crash to collect the logs, since I cleaned everything out after the last one.

I'll keep you posted.
 

WikiZell

Citizen
Original poster
Jun 2, 2019
7
1
Here are the logs from after the server crash.

kern.log:

syslog:

Thank you
 

Admin9705

Administrator
Project Manager
Donor
Jan 17, 2018
5,156
2,113
The irony is that this site previously ran on an 18.04 server from Hetzner, including all my Plex from PGBlitz, and it ran fine. When you say crash, is it randomly restarting or randomly shutting down?
 

WikiZell

Citizen
Original poster
Jun 2, 2019
7
1
When it crashes it becomes unreachable from the terminal; the only way to get it back is to restart the server from the Hetzner Robot panel.

The crashes look random but usually happen during the night, when the server is not under heavy load. I was wondering whether it goes into standby, because I can't see any trace of an error.

This morning I ran a full update (update / upgrade / dist-upgrade) and added the following repo:

Code:
ppa:oibaf/graphics-drivers
During normal activity the server runs fast and smooth.
 

Admin9705

Administrator
Project Manager
Donor
Jan 17, 2018
5,156
2,113
Yeah, that's strange. The only time I ever have issues with the command line on any of my servers is when the disk is completely full. I apologize for not knowing what to do from there. Since it works after a restart, it would have to be something on the Linux side itself. PG runs everything in Docker containers and doesn't touch the server itself in regards to users and login. When you're unable to log in, do any of the server's stats spike? Does Plex still work? I would deploy Netdata and see how your stats look when you cannot log in, to at least rule out maxed-out CPUs or runaway processes.
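
The disk-full theory is quick to rule in or out. A minimal sketch, assuming a POSIX `df`; the function names and the 95% default threshold are illustrative, not anything PG provides:

```shell
# Print the used-space percentage (a bare integer) of the filesystem
# that holds the given path. -P forces one output line per filesystem.
disk_usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the threshold (default 95).
disk_is_nearly_full() {
    [ "$(disk_usage_pct "$1")" -ge "${2:-95}" ]
}

# Example calls:
#   disk_usage_pct /
#   disk_is_nearly_full / && echo "root filesystem is almost full"
#   df -i /        # inode exhaustion can look like a full disk too
```

This only covers disk space; CPU and memory history across the freeze is what Netdata would add.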
 

WikiZell

Citizen
Original poster
Jun 2, 2019
7
1
Yes, this morning after the update I installed Netdata as well.

By the way, when it crashes nothing works: terminal, Plex, and certainly Netdata as well... it freezes completely.

The problem is not related to PG, because I removed it, cleaned up, and manually installed rclone and Plex Media Server without anything else, a super clean setup.
 

Admin9705

Administrator
Project Manager
Donor
Jan 17, 2018
5,156
2,113
We did have a member whose hardware turned out to be faulty; he asked Hetzner to swap his server. He had much the same issue with crashing.
 

houseofpaul

Citizen+
Jan 12, 2019
27
4
It has happened several times since I installed crashdump, and there is nothing in the crash dump folder. I requested a manual reset from support and they said there was nothing on the screen, so that's a dead end.

Last syslog entries before the crash (container e7a78c584d5b7b41604e06b34b5296cf7a826944a258bdc16a168416fe3b3c0f is Plex):
Jan 6 04:11:18 server dockerd[3137]: time="2020-01-06T04:11:15.352630212+01:00" level=error msg="stream copy error: reading from a closed fifo"
Jan 6 04:11:18 server dockerd[3137]: time="2020-01-06T04:11:15.352666979+01:00" level=error msg="stream copy error: reading from a closed fifo"
Jan 6 04:11:18 server dockerd[3137]: time="2020-01-06T04:11:18.767853540+01:00" level=warning msg="Health check for container e7a78c584d5b7b41604e06b34b5296cf7a826944a258bdc16a168416fe3b3c0f error: context deadline exceeded: unknown"


Fail2ban.log stopped logging a few minutes later
Jan 6 04:14:00 server sshd[22356]: Invalid user admin from 200.110.174.137 port 52104
Jan 6 04:14:00 server sshd[22356]: pam_unix(sshd:auth): check pass; user unknown
Jan 6 04:14:01 server sshd[22356]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=200.110.174.137


Same with auth.log
Jan 6 04:14:00 server sshd[22356]: Invalid user admin from 200.110.174.137 port 52104
Jan 6 04:14:00 server sshd[22356]: pam_unix(sshd:auth): check pass; user unknown
Jan 6 04:14:01 server sshd[22356]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=200.110.174.137
Jan 6 04:14:03 server sshd[22356]: Failed password for invalid user admin from 200.110.174.137 port 52104 ssh2
 

Edrock200

MVP
Staff
Nov 17, 2019
545
195
Does the crashing still happen with HW transcoding disabled? This thread with a similar error suggests an out-of-memory issue. What are the specs of your box? You can try limiting Plex to no more than a set amount of memory via the extra parameters section for the app, for example:
--memory=4G
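
To check the out-of-memory theory and to cap a container that is already running, something like the following may help. It is a sketch under assumptions: the container is assumed to be named `plex`, the function names are made up for illustration, and the grep pattern covers the usual kernel OOM-killer wording rather than every variant.

```shell
# List kernel OOM-killer messages in a log file (e.g. /var/log/kern.log).
oom_events() {
    grep -iE 'out of memory|oom-killer' "$1"
}

# Cap memory for a running container: $1 = container name, $2 = limit.
# --memory-swap is set to the same value so the container cannot go
# past the cap by swapping.
limit_container_memory() {
    docker update --memory "$2" --memory-swap "$2" "$1"
}

# Example calls (container name "plex" is an assumption):
#   oom_events /var/log/kern.log
#   limit_container_memory plex 4g
#   docker inspect --format '{{.HostConfig.Memory}}' plex   # bytes, 0 = no limit
```

If `oom_events` prints nothing for the period around the freeze, memory exhaustion becomes a less likely explanation, though a hard lock-up can also prevent the kernel from logging anything at all.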

 

Edrock200

MVP
Staff
Nov 17, 2019
545
195
Also, did you turn off thumbnail generation? That can eat up a lot of resources and may use HW decoding.
 

houseofpaul

Citizen+
Jan 12, 2019
27
4
Not sure what the OP's specs are, but I have 16 GB of RAM. I don't have thumbnail generation turned on, and it has happened with HW transcoding turned off.
 

houseofpaul

Citizen+
Jan 12, 2019
27
4
Not yet, but it has stopped happening for me since 17 Jan.

What data center is your server in? Mine is in FSN1-DC10.
 

WikiZell

Citizen
Original poster
Jun 2, 2019
7
1
Maybe it's a little late to answer, but in the end I solved my problems by installing Ubuntu 16.04.5 LTS.

I tested it for a month without any crashes, later upgraded the distro to Ubuntu 18.04.4 LTS, and now everything is working perfectly with HW acceleration enabled.
 